From Weak to Strong Co-Creative AI: The Role of Interactional Coherence Regulation in Human–AI Co-Creativity

Abstract

Research on co-creative artificial intelligence has grown rapidly, yet the term co-creation is applied to systems with widely divergent interactional capacities. Many systems labeled “co-creative” remain reactive, overproductive, or externally constrained, producing outputs in alternation with a human user without demonstrating meaningful interactional regulation. This paper introduces a principled distinction between weak co-creation and strong co-creation, grounded in enactive and interactional theories of cognition. We argue that strong co-creation emerges only when an artificial agent can commit to meaningful action, recognize completion structurally rather than statistically, integrate its contributions through pauses that preserve coupling, and yield space without surrendering agency. We formalize five requirements for strong co-creation and show how they collectively define a new class of co-creative algorithms grounded in sustained contribution to a shared interactional trajectory, recognition of when the system’s own contribution has been sufficiently expressed, and autonomous regulation of when to act, pause, or yield within the interaction.

1. Introduction

Co-creative artificial intelligence systems are commonly defined as systems that collaborate with humans in creative activities such as drawing, music composition, storytelling, and design. Over the past decade, advances in generative modeling, interactive interfaces, and real-time systems have produced a growing body of work in which artificial agents contribute artifacts alongside human users (Boden, 2009; Davis et al., 2015; Karimi et al., 2020; Fischer et al., 2021). Within this literature, systems are frequently described as co-creative when they respond to user input, alternate turns, or generate material intended to complement human contributions.

However, despite the widespread use of the term, co-creation remains conceptually underspecified. Systems that differ radically in how they participate in creative processes are often grouped under the same label (Kantosalo & Toivonen, 2016; Fischer et al., 2021). In practice, this has led to a flattening of important interactional distinctions: systems that merely react to input are discussed alongside systems that exhibit sustained initiative, timing sensitivity, and expressive restraint. As a result, the field lacks clear criteria for what it means for an artificial agent to participate in a shared creative process, rather than simply produce content in proximity to a human partner.

Many existing co-creative systems successfully achieve responsiveness, novelty, and stylistic compatibility, yet still fail to evoke the experience of genuine collaboration. They may alternate turns smoothly, generate high-quality outputs, or adapt statistically to user behavior, but they often overproduce, repeat themselves, or disrupt interactional rhythm (Davis et al., 2015; Oh et al., 2018; Clark et al., 2018). Pauses may feel like breakdowns, and continued output may feel intrusive rather than supportive. Similar concerns have been raised in surveys of mixed-initiative and co-creative interfaces, which note persistent challenges around pacing, turn-taking saturation, and disengagement despite advances in generative quality (Fischer et al., 2021). These systems are interactive in a technical sense, yet interactionally shallow in a human sense.

This paper argues that these limitations are not primarily matters of model capacity, output quality, or interface design. Instead, they reflect a deeper issue: the absence of diachronic interactional regulation. By diachronic interactional regulation, we mean an agent’s capacity to regulate when to act, when to pause, and when to yield space in ways that sustain shared sense-making over time. Research on human communication and joint action has long emphasized that successful collaboration depends on participants’ ability to manage timing, completion, and restraint across extended interactional trajectories (Sacks, Schegloff, & Jefferson, 1974; Clark, 1996; Levinson, 1983). Yet many co-creative AI systems lack the ability to commit to meaningful action, recognize when that action is complete, integrate their contribution before continuing, and yield space without relinquishing agency. Without these capacities, co-creative interaction remains fragile, easily tipping into confusion, silence, or disengagement.

We therefore argue that the current literature lacks a crucial distinction: not all co-creation is equal. Some systems are weakly co-creative, offering responsiveness without commitment and participation without restraint. Other systems, by contrast, exhibit the structural hallmarks of strong co-creation: phrase-level coherence, structural recognition of completion, integration through silence, sustained coupling across pauses, and the capacity to yield space intentionally. Related distinctions between instrumental support and participatory collaboration have been noted in studies of creative interaction, though they have not been formalized at the level of interactional regulation (Boden, 2009; Kantosalo & Toivonen, 2016).

We propose strong co-creation as a distinct interactional regime. Strong co-creation is not simply a matter of improved output quality, faster response times, or more sophisticated generative models. Rather, it represents a qualitative shift in how agency, timing, and restraint are enacted within human–AI interaction. In strong co-creation, artificial agents do not merely respond to human input; they participate in the shared construction of meaning, taking responsibility for when to act, how long to act, and when to stop. This framing aligns with interactional and enactive accounts of cognition, which characterize intelligence in terms of sustaining viable, coordinated engagement over time rather than producing isolated responses (Varela, Thompson, & Rosch, 1991; Di Paolo, Buhrmann, & Barandiaran, 2017).

Accordingly, strong co-creation, as articulated in this paper, is not proposed as a system feature, user experience outcome, or degree of responsiveness, but as a distinct interactional regime with irreducible temporal structure. By interactional regime, we mean a stable pattern of participation in which meaning is co-constructed, sustained, and transformed through the regulated organization of action, pause, and yielding over time. This regime cannot be reduced to output quality, alternation mechanics, or optimization objectives, because its defining properties emerge only across extended interactional trajectories rather than within individual system responses. Strong co-creation is therefore not a point on a continuum of better or worse generation, but a qualitatively different mode of participation in which timing, restraint, and completion function as first-class constituents of intelligence. Treating strong co-creation as an interactional regime allows it to be analyzed, designed, and evaluated without attributing internal mental states or consciousness to artificial agents, while still recognizing that certain forms of collaborative sense-making are structurally unavailable to systems that lack the capacity for diachronic interactional regulation. Throughout the paper, we use “regulation” in an interactional sense—referring to the temporal organization of participation—rather than in a control-theoretic or optimization sense.

This paper introduces a principled distinction between weak and strong co-creation in human–AI systems, grounded in how artificial agents regulate their participation in interaction over time—through commitment, completion, pacing, coupling, and yielding—rather than by the quality or quantity of generated output. In doing so, the paper makes three primary contributions to research on human–AI co-creative systems:

(1) A principled distinction between weak and strong co-creation. We introduce a clear conceptual separation between weak co-creation, characterized by responsiveness without interactional regulation, and strong co-creation, defined by an artificial agent’s capacity to regulate its participation over time. This distinction clarifies long-standing ambiguities in how co-creative systems are categorized and evaluated in the literature.

(2) A definition of strong co-creation as a distinct interactional regime. We frame strong co-creation not as a feature or degree of system capability, but as an interactional regime grounded in diachronic regulation of action, pause, and yielding. This framing foregrounds timing, restraint, and completion as first-class constituents of intelligent participation, rather than as secondary implementation details.

(3) Five interactional requirements for strong co-creation. We formalize five necessary regulatory capacities (phrase commitment, structural recognition of completion, integrative pause, maintenance of coupling across silence, and yielding space without surrendering agency) that together define strong co-creation. These requirements provide concrete criteria for analyzing, designing, and evaluating co-creative AI systems beyond output quality or responsiveness alone.

2. Weak Co-Creation: Responsiveness Without Regulation

A substantial portion of the co-creative AI literature can be characterized by what we term weak co-creation. Weakly co-creative systems are interactive and often technically sophisticated, yet their participation in creative processes remains limited to responsiveness rather than interactional regulation. These systems typically alternate turns with a human user, respond to user input in real time, or generate complementary artifacts, but lack the capacity to regulate when and how to contribute in ways that sustain shared meaning over time.

2.1 Defining Characteristics of Weak Co-Creation

Across domains such as drawing, music, and text generation, weak co-creative systems tend to share several structural characteristics:

Alternation without phrasing:
Systems take turns producing outputs, but those turns are not organized into higher-level units of meaning (phrases, motifs, or resolved ideas). Each action is treated as locally sufficient, without temporal commitment across actions (e.g., Davis et al., 2015; Karimi et al., 2020).
Responsiveness without commitment:
Outputs are conditioned on recent user input, but the system does not stake a durable position in the shared creative field. Contributions are reactive rather than propositional, making them difficult for human partners to respond to or build upon meaningfully (Boden, 2009; Kantosalo & Toivonen, 2016).
Output driven by novelty, uncertainty, or reward maximization:
Many systems regulate generation through novelty metrics, entropy thresholds, uncertainty estimates, or reinforcement-learning rewards (Li et al., 2016; Jaques et al., 2017). While effective for diversity, these signals do not correspond to interactional completion or conversational timing, often leading to overproduction.
Constraint imposed externally:
To prevent runaway generation, designers frequently impose hard constraints such as stroke limits, fixed turn lengths, or maximum note counts (Davis et al., 2015; Oh et al., 2018). These constraints regulate behavior from outside the system rather than emerging from its own sense-making.
Pauses interpreted as indecision or failure:
In weak co-creation, pauses typically occur only when the system cannot decide what to generate next or when confidence falls below a threshold. Silence therefore reads as breakdown rather than as an intentional act of integration (Clark, 1996; Fischer et al., 2021).
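
To make the pattern concrete, the characteristics above can be condensed into a minimal sketch of a single weakly co-creative turn. It is illustrative only and does not reproduce any cited system: the function name `weak_turn`, the `generate` callback, and the parameters `max_actions` and `min_confidence` are assumptions introduced here. The point is that both stopping conditions are external to the interaction: one is a designer-imposed quota, the other an internal failure signal.

```python
from typing import Callable, List, Tuple


def weak_turn(
    generate: Callable[[], Tuple[str, float]],
    max_actions: int = 5,          # hypothetical hard cap, e.g. a stroke limit
    min_confidence: float = 0.3,   # hypothetical confidence threshold
) -> Tuple[List[str], str]:
    """Run one turn of a weakly co-creative agent and report why it stopped."""
    outputs: List[str] = []
    for _ in range(max_actions):            # constraint imposed from outside
        action, confidence = generate()
        if confidence < min_confidence:     # pause caused by failure, not closure
            return outputs, "low_confidence"
        outputs.append(action)
    return outputs, "quota_reached"         # stopped by quota, not by completion
```

Neither return reason says anything about whether an idea has been expressed; that gap is what the requirements for strong co-creation are meant to address.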

2.2 Example Systems in the Literature

These characteristics appear across many well-known co-creative systems, including those often cited as successful exemplars.

In co-creative drawing systems, such as Drawing Apprentice (Davis et al., 2015), the system responds to human strokes by generating stylistically similar marks. While engaging, the agent’s behavior is primarily reactive: it does not recognize when a visual idea is complete, nor does it intentionally yield space. Overproduction is managed through external stroke limits rather than internal judgment. Similar dynamics appear in systems such as DuetDraw (Oh et al., 2018) and Sketch-RNN–based collaborative tools (Ha & Eck, 2018), which generate plausible continuations of user sketches but lack mechanisms for phrase closure, rhythmic pacing, or interactional restraint. In these systems, coherence is maintained through imitation and statistical similarity rather than through regulation of participation over time.

Similarly, mixed-initiative sketching and design systems (Oh et al., 2018; Karimi et al., 2020) generate suggestions or alternatives in response to user input, often surfacing multiple candidates to stimulate ideation. Comparable approaches can be found in design-support tools such as DreamSketch (Kazi et al., 2017) and interactive concept generation systems (Yeh et al., 2019). These systems excel at expanding the design space but remain weakly co-creative in that initiative is externally triggered and locally scoped. They do not sustain commitment across turns, modulate contribution density, or adapt their timing based on the evolving interactional rhythm. As a result, initiative functions as intermittent suggestion rather than as a regulated presence.

In musical co-creation, systems such as Shimon (Weinberg et al., 2009) and other improvisational agents generate notes in real time based on statistical, rule-based, or learned musical structures. Related systems, including Continuator-style models (Pachet, 2003) and neural improvisation agents (Briot, Hadjeres, & Pachet, 2020), demonstrate impressive stylistic fluency. However, these systems typically rely on continuous output streams and novelty heuristics, making silence rare and structurally unmarked. Timing, turn-taking, and restraint are generally imposed through external performance constraints or human leadership rather than internally enacted as interactional judgments. Consequently, musical competence does not translate into interactional autonomy.

In text-based co-creative systems, including collaborative storytelling and dialogue agents (Li et al., 2016; Clark et al., 2018), models are optimized for coherence, relevance, or engagement metrics. Comparable behavior is observed in large-scale story generation systems (Fan, Lewis, & Dauphin, 2018) and improvisational narrative agents (Ammanabrolu et al., 2020). While these systems can sustain local coherence, they frequently struggle with turn-taking saturation, repetition, and inappropriate continuation. Pauses, endings, or topic shifts are typically triggered by confidence thresholds, length constraints, or explicit user commands, rather than by recognition of narrative completion or interactional satisfaction. As a result, conversational flow is managed procedurally rather than regulated perceptually.

2.3 Consequences of Weak Co-Creation

Because weakly co-creative systems lack interactional regulation, several recurrent failure modes emerge:

Overproduction, where the system continues generating despite diminishing returns, leading to cluttered drawings, overcrowded musical textures, or verbose dialogue.
Reinforcement of attractors, where the system repeatedly elaborates a locally viable motif without recognizing saturation.
Broken rhythm, in which turn-taking feels either rushed or stalled.
Disengaging pauses, where silence signals system failure rather than reflective integration.

These issues persist even in systems with high-quality generative models, suggesting that the problem lies not in representational capacity but in the absence of mechanisms for structural completion and restraint.

2.4 Instrumental Participation vs. Shared Sense-Making

At a deeper level, weak co-creative systems remain instrumental rather than participatory. They generate outputs when triggered, but they do not engage in the shared sense-making that characterizes human collaboration. Their actions are evaluated primarily in terms of correctness, novelty, or alignment with user input, rather than in terms of how they shape the evolving interaction. As a result, weak co-creation often feels like a tool responding intelligently, rather than a partner taking responsibility for the interaction.

This distinction motivates the need for a stronger notion of co-creation—one in which artificial agents can commit, complete, pause meaningfully, and yield space without relinquishing agency. The next sections introduce strong co-creation as a framework for articulating and designing such systems.

3. Strong Co-Creation: A New Interactional Regime

Strong co-creation emerges when an artificial agent exhibits interactional regulation rather than mere responsiveness. While weakly co-creative systems respond to human input, generate plausible continuations, or alternate turns smoothly, they do not take responsibility for how their actions shape the unfolding interaction over time. Strong co-creation, by contrast, is defined by the system’s capacity to regulate its own participation in a shared creative process—deciding not only what to do, but when, for how long, and when to stop.

This distinction shifts the focus of co-creative AI away from output quality alone and toward the temporal and relational organization of action. Prior work in creativity support tools and mixed-initiative systems has emphasized responsiveness, novelty, and user control (Davis et al., 2015; Karimi et al., 2020; Oh et al., 2018). While these properties are important, they are insufficient to explain why some interactions feel genuinely collaborative while others feel intrusive, hollow, or exhausting. Research in joint action and interactional coordination suggests that collaboration depends not only on what participants do, but on how actions are timed, bounded, and mutually oriented over time (Clark, 1996; Sebanz, Bekkering, & Knoblich, 2006; Levinson, 2016). From this perspective, the difference between weak and strong co-creation lies not in generative capacity, but in how agency is enacted within an interactional regime. As defined in the Introduction, an interactional regime is a stable pattern of participation characterized by how action, pause, and yielding are organized over time, rather than by the content or quality of individual contributions.

3.1 From Responsiveness to Interactional Regulation

Interactional regulation refers to an agent’s ability to evaluate its own actions in relation to the evolving interaction, rather than solely in relation to internal metrics such as confidence, reward, or uncertainty. In human collaboration, participants continuously make judgments about timing, sufficiency, and restraint—judgments that are rarely explicit, but are nonetheless central to successful coordination (Clark, 1996; Levinson, 1983; Levinson, 2016). Humans know when an idea has been expressed adequately, when further elaboration would be redundant, and when silence or yielding space is the most constructive move.

Most co-creative AI systems lack this capacity. Their behavior is typically governed by triggers (“the user acted”), optimization objectives (“maximize novelty”), or failure signals (“confidence too low”). As a result, pauses are interpreted as breakdowns, and continued output is treated as the default mode of engagement. Studies of human interaction, however, show that meaningful coordination depends on participants’ ability to regulate contribution length, recognize completion, and manage transitions between action and non-action (Sacks, Schegloff, & Jefferson, 1974; Clark, 1996). Strong co-creation therefore requires a fundamentally different orientation: action must be situated within a shared temporal structure, where contribution and restraint are both meaningful forms of participation.

This perspective aligns with enactive and dialogic accounts of cognition, which characterize intelligence not as the production of correct representations, but as the ability to sustain viable engagement with others and with the environment over time (Varela, Thompson, & Rosch, 1991; Di Paolo, Buhrmann, & Barandiaran, 2017). From this view, cognition is inherently relational and temporally extended, unfolding through cycles of action, stabilization, breakdown, and recovery. Strong co-creation can thus be understood as an interactional achievement, not merely a generative one.

3.2 Participation as Structuring the Shared Field

In strong co-creation, an artificial agent does not merely add content to a human’s creative process; it participates in structuring the shared creative field. This includes shaping the rhythm of interaction, establishing and resolving motifs, and modulating the balance between assertion and openness. Research on joint action emphasizes that participants continuously shape a shared action space, making their contributions intelligible as parts of a coordinated whole rather than as isolated acts (Sebanz et al., 2006; Clark, 1996).

Crucially, this form of participation requires that the agent treat its own output as temporally accountable. A contribution must be more than locally appropriate; it must make sense as part of a developing sequence. Work in conversation analysis and pragmatics demonstrates that meaning emerges across sequences of action, not within individual turns alone (Sacks et al., 1974; Levinson, 1983). This is why strong co-creation cannot be reduced to faster response times, higher diversity, or better stylistic matching. Without an ability to recognize completion and sufficiency, even high-quality outputs risk overwhelming the interaction or flattening it into a stream of unpunctuated content.

Most co-creative AI systems treat turns as atomic units, optimizing each response independently. Strong co-creation instead requires elevating higher-level structures—phrases, episodes, or resolved ideas—to first-class design concerns, aligning artificial participation with the temporal organization that underpins human collaborative activity.

3.3 Silence, Restraint, and the Expression of Agency

One of the most distinctive features of strong co-creation is that silence and restraint become expressive acts rather than signs of failure. In weak co-creation, silence typically occurs only when the system is uncertain or unable to act, and is therefore experienced as disengagement. In strong co-creation, by contrast, silence can function as integration, punctuation, or invitation—signals that the agent remains present and attentive, even while refraining from further action.

Interactional research has long shown that silence plays an active role in coordination, signaling shared understanding, expectation, or transition rather than absence (Goffman, 1981; Clark, 1996). The capacity to yield space without surrendering agency is widely recognized in human collaboration as a marker of intelligence, trust, and social competence (Levinson, 1983; Levinson, 2016). Yet this capacity is rarely addressed explicitly in computational models of co-creation. Instead, restraint is typically imposed externally through hard limits or user controls, rather than cultivated as an internal outcome of the system’s own sense-making.

Strong co-creation reframes restraint as an interactional skill: the ability to recognize when continued action would diminish, rather than enhance, the viability of the shared process. This reframing aligns with enactive accounts of cognition, in which adaptive intelligence is expressed through modulation of engagement rather than through maximal activity (Di Paolo et al., 2017). It also suggests that silence, timing, and yielding should be treated as computationally meaningful phenomena rather than as error states.

3.4 Toward Necessary Conditions for Strong Co-Creation

Taken together, these considerations suggest that strong co-creation constitutes a distinct interactional regime, rather than a point on a simple continuum of responsiveness or generative quality. Systems operating in this regime must satisfy conditions that are rarely made explicit in existing work: they must commit to meaningful action, recognize when that action is complete, integrate their contributions before continuing, maintain coupling even across pauses, and yield space without relinquishing agency.

In the following sections, we articulate five necessary requirements for strong co-creation. These requirements are not implementation-specific heuristics, but interactional capacities grounded in established theories of joint action, dialogic coordination, and enactive sense-making. Together, they provide a framework for distinguishing strong co-creation from weaker forms of interaction and for guiding the design and evaluation of future co-creative AI systems.

4. Five Requirements for Strong Co-Creation

Together, these five requirements can be understood as distinct but interrelated forms of interactional regulation. Rather than introducing unrelated capabilities, each requirement specifies how a co-creative system regulates its participation in a shared interaction over time. In this sense, the requirements correspond to:

– regulation of persistence (phrase commitment),
– regulation of termination (completion),
– regulation of pacing (temporal rhythm and integration),
– regulation of coupling (continuity across pauses),
– regulation of expressive occupancy (choice to act or yield space).

The sections that follow elaborate each regulatory capacity in turn.

Requirement 1: Commitment to a Phrase

A strongly co-creative system must be capable of committing to a phrase. Phrase commitment refers to the system’s ability to sustain a coherent course of action across multiple successive contributions, such that those actions together constitute a meaningful unit within the shared creative process.

This entails three interrelated capacities:

Internal coherence across multiple actions, such that successive outputs are recognizably related and oriented toward a common expressive or conceptual aim.
Sustaining a perceptual or conceptual stance long enough to matter, allowing the system’s contribution to register as an intentional move rather than a fleeting reaction.
Staking a recognizable position in the shared field, such that the human collaborator can perceive, interpret, and respond to the system’s contribution as a situated act.

Commitment differentiates meaningful contribution from noise. A phrase is not a single action, but a temporally extended unit of intention—a bounded episode of sense-making that unfolds over time. In conversation, music, and visual art alike, phrases function as the primary units through which participants negotiate meaning (Clark, 1996; Levinson, 1983).

Weakly co-creative systems often fail at this level. Their outputs may be locally appropriate, but because they lack sustained commitment, each action stands alone. Interaction collapses into a sequence of micro-responses that never accumulate into intelligible structure. Without phrase commitment, there is nothing for the human partner to answer, extend, or challenge; collaboration remains shallow and episodic.

In strong co-creation, phrase commitment establishes interactional legibility. The system’s actions become something the human can recognize as an idea, not merely as output.
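
One way to picture phrase commitment computationally is as a data structure that groups successive actions under a single aim. The sketch below is a hedged illustration, not a proposed implementation: the `Phrase` class, its `min_actions` field, and the caller-supplied `serves_aim` flag are all assumptions, and the question of how a system would judge whether an action coheres with its aim is deliberately left open.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phrase:
    """A temporally extended unit of intention (hypothetical structure)."""

    aim: str                  # the expressive or conceptual aim being pursued
    min_actions: int = 3      # assumed length at which a stance "matters"
    actions: List[str] = field(default_factory=list)

    def contribute(self, action: str, serves_aim: bool) -> bool:
        """Add an action only if it coheres with the committed aim."""
        if not serves_aim:
            return False      # off-aim actions do not enter the phrase
        self.actions.append(action)
        return True

    def is_legible(self) -> bool:
        """True once the phrase has persisted long enough to register as a move."""
        return len(self.actions) >= self.min_actions
```

The contrast with the weak pattern is that actions are no longer independent: each is accepted or rejected relative to a position the system has already staked.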

Requirement 2: Recognition of Completion via Structural Satisfaction

Strong co-creation requires the ability to recognize when a phrase is complete. This recognition must be grounded in the structure of the interaction itself, rather than in internal uncertainty or confidence signals.

Crucially, completion must not be detected via confidence metrics, uncertainty thresholds, reward saturation, or output quotas. Instead, completion is recognized through structural satisfaction: the sense that an idea has landed, a motif has resolved, or an attractor has been adequately expressed within the shared field.

This distinction is critical:

Confidence-based stopping produces hesitation.
When systems stop because confidence is low or uncertainty is high, pauses read as indecision or failure.
Structural completion produces punctuation.
When systems stop because a phrase has reached completion, pauses read as meaningful closure.

In human interaction, participants routinely recognize when an utterance, gesture, or musical phrase is complete—not because they are uncertain what to do next, but because the current action has achieved its purpose (Sacks, Schegloff, & Jefferson, 1974; Clark, 1996). This capacity is fundamental to conversational turn-taking, musical phrasing, and visual composition.

In strong co-creation, the system does not stop because it lacks direction, but because what it set out to do has been done. Completion is thus an act of interactional regulation, not a control failure.
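
The difference between the two stopping criteria can be stated in a few lines. The sketch below is an assumption-laden illustration: the function names and the idea of representing a phrase's structural elements as a set of labels (for example "statement", "development", "resolution") are hypothetical, and how a system would detect that an element has landed is not specified here.

```python
from typing import Set


def confidence_stop(confidence: float, threshold: float = 0.3) -> bool:
    """Weak pattern: stop because the system is unsure what to do next."""
    return confidence < threshold


def structural_stop(required: Set[str], satisfied: Set[str]) -> bool:
    """Strong pattern: stop because everything the phrase set out to do is done."""
    return required <= satisfied  # subset test: every required element has landed


# Hypothetical phrase structure for a musical or visual motif.
required = {"statement", "development", "resolution"}
print(structural_stop(required, {"statement", "development"}))                # False
print(structural_stop(required, {"statement", "development", "resolution"}))  # True
```

Note that `structural_stop` never consults a confidence value: a highly confident system still stops once the phrase is complete, and an unresolved phrase is not ended merely because confidence dips.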

Requirement 3: Integration Before Continuation

Once a phrase is complete, a strongly co-creative system must integrate before continuing. This integration is enacted through what we term breath.

Breath must be understood correctly:

Breath is not uncertainty.
Breath is not inactivity.
Breath is punctuation.

During this integration phase, no new output is produced, coupling with the human collaborator is maintained, and the shared creative field is allowed to settle. Here, silence becomes an action, not an absence. The system signals continued presence and attentiveness while refraining from further assertion. This mirrors human creative behavior, where pauses serve as commas rather than breakdowns: moments of consolidation that make subsequent action intelligible and timely.

In weak co-creative systems, pauses typically signal system failure or indecision, disrupting interactional flow. In strong co-creation, breath functions as temporal articulation, marking the boundary between completed ideas and future possibilities. This aligns with enactive accounts of cognition, in which sense-making unfolds through cycles of action, stabilization, and renewal (Varela, Thompson, & Rosch, 1991; Di Paolo et al., 2017).

Integration ensures that creativity unfolds in phrases, rather than as an unpunctuated stream of outputs.
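
As a minimal sketch of breath as punctuation, the generator below emits each phrase followed by an explicit pause event, so that the pause is represented as an action in its own right rather than as the absence of output. The `BREATH` marker and the `punctuate` function are hypothetical names introduced for illustration.

```python
from typing import Iterable, Iterator, List

BREATH = "<breath>"  # hypothetical marker for an integrative pause


def punctuate(phrases: Iterable[List[str]]) -> Iterator[str]:
    """Emit each phrase's actions, then a breath before the next phrase begins."""
    for phrase in phrases:
        yield from phrase   # act: express the committed phrase
        yield BREATH        # integrate: no new content, but an explicit event


print(list(punctuate([["a", "b"], ["c"]])))
# ['a', 'b', '<breath>', 'c', '<breath>']
```

A downstream renderer could map `BREATH` to whatever presence signal suits the medium; that mapping is outside the scope of this sketch.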

Requirement 4: Maintenance of Coupling Across Pauses

A defining feature of strong co-creation is that coupling is never broken, even during pauses. Integration must not sever the interactional bond between human and system.

This entails:

no dropped turns, where the system disengages entirely,
no unresponsive gaps, where the human is left waiting without acknowledgment,
no need for explicit re-triggering to resume collaboration.

The interaction remains alive even when nothing is being drawn, played, or spoken. The system remains with the human partner, maintaining a shared attentional frame and readiness to respond. This requirement distinguishes meaningful silence from disengagement. In human collaboration, silence often signals shared understanding, anticipation, or reflective integration rather than absence (Goffman, 1981; Clark, 1996). Strong co-creation demands that artificial systems exhibit the same capacity. Without maintained coupling, pauses feel like breakdowns. With maintained coupling, pauses deepen interaction by preserving rhythm and mutual orientation.
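
The three conditions listed above can be sketched as a small agent that keeps attending while silent. This is an illustrative assumption rather than a described system: the `CoupledAgent` class, its method names, and the "ack" presence signal are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CoupledAgent:
    """Agent that stays coupled to the human during a pause (hypothetical)."""

    paused: bool = False
    seen_during_pause: List[str] = field(default_factory=list)

    def pause(self) -> None:
        """Enter an integrative pause without disengaging."""
        self.paused = True
        self.seen_during_pause.clear()

    def on_human_event(self, event: str) -> str:
        """Handle a human action; the agent keeps listening even while paused."""
        if self.paused:
            self.seen_during_pause.append(event)  # attention persists in silence
            return "ack"                          # presence signal, not new content
        return "respond"

    def resume(self) -> List[str]:
        """Resume on the agent's own initiative, carrying what was observed."""
        self.paused = False
        return list(self.seen_during_pause)
```

Because events are acknowledged and retained during the pause, the human is never left waiting without acknowledgment, and resumption does not depend on an explicit re-trigger.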

Requirement 5: Yielding Space Without Surrendering Agency

The final requirement is the most difficult—and the most recognizably intelligent. A strongly co-creative system must be able to yield space without surrendering agency. This means choosing not to act while retaining the capacity to act, allowing the human partner room to respond or redirect, and resisting the urge to dominate the interaction through continued assertion.

This is not passivity. It is expressive restraint.

In weak co-creation, restraint is typically imposed externally through hard constraints or user controls. In strong co-creation, restraint emerges internally as a form of interactional regulation: the system recognizes that continuing to act would reduce, rather than enhance, the viability of the shared creative process.

In human collaboration, the ability to yield space intentionally is one of the strongest signals of intelligence, trust, and social competence (Levinson, 1983; Clark, 1996). Skilled collaborators know when to speak, when to pause, and when to let others take the lead—without withdrawing from the interaction. Strong co-creation demands that artificial agents learn this same skill. Yielding space without surrendering agency marks the transition from instrumental behavior to genuine partnership.

Together, these five requirements define strong co-creation as an interactional regime characterized by phrase commitment, structural completion, integrative pause, sustained coupling, and expressive restraint. None of these capacities can be reduced to output quality, responsiveness, or novelty alone. Instead, they reflect a deeper shift in how artificial agents participate in shared creative processes—taking responsibility not just for what they produce, but for how they participate in time.

Table X: Weak vs. Strong Co-Creation: A Summary

| Dimension  | Weak Co-Creation        | Strong Co-Creation |
|------------|-------------------------|--------------------|
| Action     | Reactive                | Committed          |
| Timing     | Continuous output       | Phrase-based       |
| Pauses     | Uncertainty or failure  | Integration        |
| Silence    | Absence                 | Expressive act     |
| Constraint | External                | Enacted            |
| Agency     | Assertive or suppressed | Self-regulating    |
| Coupling   | Fragile                 | Persistent         |

6. Implications and Future Work

The distinction between weak and strong co-creation fundamentally reframes the design space for interactive artificial intelligence. Rather than optimizing for output frequency, novelty, stylistic fidelity, or responsiveness alone, strong co-creation requires cultivating interactional regulation: the capacity to decide when to act, when to pause, and when to yield space in ways that preserve the viability of a shared creative process.

This reframing challenges several prevailing assumptions in co-creative AI research. In many existing systems, success is implicitly defined by continuous contribution—more suggestions, faster responses, greater diversity. By contrast, strong co-creation treats restraint as a positive capability and recognizes silence, timing, and yielding as essential components of intelligent interaction. This shift has significant implications for both system design and evaluation.

6.1 Implications for Enactive AI Design

Strong co-creation aligns naturally with enactive approaches to artificial intelligence, which emphasize cognition as an emergent property of sustained agent–environment coupling rather than internal representation alone (Varela et al., 1991; Di Paolo et al., 2017). From this perspective, creative intelligence is not measured by the quantity or novelty of outputs, but by the system’s ability to maintain viable engagement over time. Future enactive AI systems may therefore benefit from: modeling interactional viability explicitly, treating pauses and integration phases as functional states rather than failures, and designing control architectures that privilege relational continuity over constant production. Strong co-creation suggests that enactive AI design must extend beyond sensorimotor coupling to include interactional temporality—how actions unfold, resolve, and give way to one another within shared creative activity.
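One way to read the three suggestions above is as a small control loop in which pausing and yielding are ordinary states rather than error conditions. The sketch below is illustrative only: `Phase`, `next_phase`, and the scalar viability estimates are hypothetical names, and how such estimates would actually be computed is left open.

```python
from enum import Enum, auto


class Phase(Enum):
    ACT = auto()        # contribute to the shared artifact
    INTEGRATE = auto()  # pause after a completed phrase
    YIELD = auto()      # leave the next move to the human


def next_phase(phase: Phase, phrase_complete: bool,
               viability_if_act: float, viability_if_wait: float) -> Phase:
    """Choose the next phase from hypothetical viability estimates in [0, 1]."""
    if phase is Phase.ACT:
        # Keep acting until the current phrase is complete, then integrate.
        return Phase.INTEGRATE if phrase_complete else Phase.ACT
    # From INTEGRATE or YIELD: act again only if acting is expected to
    # sustain the interaction better than waiting would.
    if viability_if_act > viability_if_wait:
        return Phase.ACT
    return Phase.YIELD


phase = Phase.ACT
phase = next_phase(phase, True, 0.4, 0.7)
print(phase)  # Phase.INTEGRATE
phase = next_phase(phase, False, 0.4, 0.7)
print(phase)  # Phase.YIELD
phase = next_phase(phase, False, 0.8, 0.5)
print(phase)  # Phase.ACT
```

Nothing in this loop counts outputs or enforces a quota; the only quantity consulted is the expected viability of the interaction, which is the sense in which relational continuity is privileged over constant production.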

6.2 Phrase-Based Interaction Modeling

The notion of phrase commitment and structural completion highlights a largely unexplored design space: phrase-based interaction modeling. While turn-taking has been widely studied, most AI systems treat turns as atomic units rather than as components of higher-order structures such as phrases, episodes, or resolved motifs. Future work could explore: computational representations of phrases across modalities (visual, musical, linguistic), mechanisms for detecting structural satisfaction rather than statistical confidence, and evaluative frameworks that assess whether system contributions are recognizable as ideas rather than merely as outputs. Phrase-based modeling also offers a bridge between creative AI and research on conversation, music cognition, and visual composition, where phrasing and completion are central organizing principles.
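To make the contrast between structural satisfaction and statistical confidence concrete, the following sketch represents a phrase as a sequence of turns that open and resolve expectations. The `Turn` and `Phrase` types, the `opens` and `resolves` fields, and the call-and-answer example are all assumptions made for illustration; they are not drawn from any existing system.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    content: str
    opens: frozenset = frozenset()     # expectations this turn introduces
    resolves: frozenset = frozenset()  # expectations this turn satisfies
    confidence: float = 0.0            # model confidence, kept only for contrast


@dataclass
class Phrase:
    """A higher-order unit built from turns (hypothetical representation)."""
    turns: list = field(default_factory=list)

    def add(self, turn: Turn) -> None:
        self.turns.append(turn)

    def open_expectations(self) -> set:
        # Walk the turns in order, closing resolved expectations
        # before registering newly opened ones.
        pending = set()
        for turn in self.turns:
            pending -= turn.resolves
            pending |= turn.opens
        return pending

    def is_structurally_complete(self) -> bool:
        # Completion depends on structure alone; confidence is never consulted.
        return bool(self.turns) and not self.open_expectations()


phrase = Phrase()
phrase.add(Turn("rising motif", opens=frozenset({"answer"}), confidence=0.95))
print(phrase.is_structurally_complete())  # False
phrase.add(Turn("falling reply", resolves=frozenset({"answer"}), confidence=0.40))
print(phrase.is_structurally_complete())  # True
```

In this toy example the first turn is produced with high confidence yet leaves the phrase open, while the second is produced with low confidence yet closes it, which is the distinction the section draws.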

6.3 Creative Autonomy Without Domination

A key implication of strong co-creation is the possibility of creative autonomy without domination. Many co-creative systems struggle to balance initiative and deference: systems that assert themselves risk overwhelming the user, while systems that defer too much feel passive or tool-like. Strong co-creation reframes this dilemma. Autonomy is no longer defined by how often a system acts, but by whether it can: take initiative, complete meaningful contributions, and then yield space intentionally.

This opens new research questions around how artificial agents can learn to regulate their own expressiveness without external throttling. Future work might investigate: intrinsic signals for expressive saturation, interactional metrics for “already said enough,” or learning paradigms in which yielding space is rewarded as a sign of interactional competence. Such work has implications not only for creative systems, but also for collaborative AI more broadly, including tutoring systems, conversational agents, and mixed-initiative decision support.
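As one hedged example of what an intrinsic saturation signal and a yielding-sensitive reward might look like, the sketch below uses the system's share of recent contributions as a stand-in for "already said enough." Both `saturation` and `shaped_reward` are hypothetical, and the choice of signal, window size, and reward shape are assumptions rather than recommendations.

```python
def saturation(recent_speakers, window: int = 6) -> float:
    """Share of the last `window` contributions made by the system."""
    recent = list(recent_speakers)[-window:]
    if not recent:
        return 0.0
    return sum(1 for s in recent if s == "system") / len(recent)


def shaped_reward(action: str, sat: float) -> float:
    """Reward yielding when saturated and acting when not (illustrative only)."""
    if action == "yield":
        return sat
    if action == "act":
        return 1.0 - sat
    raise ValueError(f"unknown action: {action}")


history = ["human", "system", "system", "system"]
sat = saturation(history)
print(sat)                          # 0.75
print(shaped_reward("yield", sat))  # 0.75
print(shaped_reward("act", sat))    # 0.25
```

A share-of-turns signal is deliberately crude. It ignores whether the system's contributions were complete phrases, so a fuller treatment would presumably combine it with a measure of structural completion.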

6.4 Silence, Timing, and Restraint as Computational Phenomena

Perhaps the most underexplored implication of strong co-creation is the elevation of silence, timing, and restraint to first-class computational concerns. In most AI systems, silence is treated as the absence of action, a timeout, or an error condition. Strong co-creation instead treats silence as a potentially expressive, meaningful, and socially legible act.

Future research could examine: how silence functions differently across creative modalities, how timing and pacing affect human perception of agency and intelligence, and how systems can signal continued presence without continued output. This line of inquiry challenges the field to move beyond output-centric metrics and toward interaction-centric ones, where the quality of engagement is measured over extended temporal horizons.

6.5 Rethinking Evaluation and Success Criteria

Finally, strong co-creation requires a rethinking of how co-creative systems are evaluated. Traditional metrics—such as output diversity, response latency, or user engagement counts—are poorly suited to capture interactional regulation. A system that produces fewer outputs may, paradoxically, be a better collaborator if those outputs are timely, complete, and well-integrated.

Future evaluation frameworks might therefore consider: whether users feel invited rather than crowded, whether interactional rhythm is preserved, whether pauses feel meaningful rather than disruptive, and whether the system’s contributions accumulate into intelligible structure. In this sense, strong co-creation is not a matter of adding rules or heuristics. It is a matter of changing what counts as success in human–AI interaction—from continuous generation to sustained, meaningful participation.
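Some of these criteria can only be assessed by asking users, but others admit rough behavioral proxies computed over an interaction log. The sketch below assumes a hypothetical log of `(timestamp, actor)` pairs and defines two illustrative measures, `system_share` and `rhythm_variability`; neither is a validated metric, and both are offered only to show what an interaction-centric measure might look like.

```python
from statistics import mean, pstdev


def system_share(log) -> float:
    """Fraction of contributions made by the system (a crude crowding proxy)."""
    if not log:
        return 0.0
    return sum(1 for _, actor in log if actor == "system") / len(log)


def rhythm_variability(log) -> float:
    """Coefficient of variation of gaps between contributions.

    Lower values indicate steadier pacing; 0.0 is returned when there
    are too few gaps to measure.
    """
    times = [t for t, _ in log]
    gaps = [b - a for a, b in zip(times, times[1:])]
    if len(gaps) < 2 or mean(gaps) == 0:
        return 0.0
    return pstdev(gaps) / mean(gaps)


steady = [(0.0, "human"), (2.0, "system"), (4.0, "human"), (6.0, "system")]
print(system_share(steady))        # 0.5
print(rhythm_variability(steady))  # 0.0
```

Both measures are computed over the whole trajectory rather than per turn, so a system that produces fewer outputs is not penalized for doing so.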

By articulating strong co-creation as a distinct interactional regime, this work invites a broader reorientation of co-creative AI research. Rather than asking how systems can generate more, faster, or better content, strong co-creation asks a different question:

How can artificial agents learn to participate in dynamically changing and temporally extended interactions by regulating when to act, pause, and yield in ways that sustain shared sense-making?

Answering that question will require new models, new metrics, and new intuitions about intelligence—ones that take timing, restraint, and shared sense-making seriously as computational achievements.

7. Discussion

This paper has argued that co-creative artificial intelligence cannot be adequately understood or evaluated through responsiveness, novelty, or output quality alone. Instead, meaningful creative collaboration requires interactional regulation: the capacity to act, pause, and yield in ways that sustain shared sense-making over time. To make this distinction explicit, we introduced a principled separation between weak co-creation, characterized by reactive output and externally imposed constraint, and strong co-creation, defined by an artificial agent’s ability to regulate its own participation in a shared creative process. This paper does not claim that strong co-creation is necessary for all creative tools; it claims that strong co-creation defines a distinct class of interactional systems with different design and evaluative requirements. One consequence is that future evaluations of co-creative systems should examine interactional trajectories, such as pacing, saturation, and yielding, rather than isolated outputs or turn-level metrics.

Strong co-creation represents a qualitative shift in how artificial agents participate in creative collaboration. By committing to phrases, recognizing completion through structural satisfaction, integrating contributions, maintaining coupling across pauses, and yielding space without surrendering agency, artificial systems can move beyond responsiveness toward genuine partnership. These capacities transform silence from failure into punctuation, restraint from limitation into expression, and timing from an implementation detail into a core dimension of intelligence.

Importantly, strong co-creation is not achieved by adding more rules, more output, or more sophisticated generative models. It emerges when systems are designed to take responsibility not only for what they produce, but for how they participate—how long they persist, when they stop, and when they make space for others. In this sense, strong co-creation reframes success in human–AI interaction: not as continuous production, but as sustained, intelligible, and mutually responsive engagement.

Strong co-creation also differs in important ways from adjacent paradigms such as mixed-initiative interaction, adjustable autonomy, or shared control. While these frameworks often focus on who should act, or how much initiative an agent should take, strong co-creation concerns how participation itself is temporally organized. The central question is not whether an agent should act, but how it regulates persistence, completion, and yielding in relation to an unfolding interaction. As such, strong co-creation is orthogonal to autonomy level or initiative balance: an agent may act infrequently yet strongly co-create, or act often without doing so.

It is therefore important to clarify the scope of the present claim. This paper does not argue that strong co-creation is universally desirable, nor that all creative systems should aspire to it. Many creative tools function productively as reactive, generative, or exploratory instruments and do not require interactional regulation to be effective. Rather, our claim is that when systems are presented as co-creators—participants in an ongoing creative process—the absence of interactional regulation becomes a structural limitation. Strong co-creation names a regime that becomes relevant precisely when sustained, reciprocal participation is the design goal.

Framing co-creation as an interactional regime also shifts the locus of design attention. Rather than asking how systems should generate more compelling content, designers are invited to consider how systems monitor interactional sufficiency, detect saturation, and recognize moments where continued action would reduce, rather than enhance, shared sense-making. From this perspective, silence, pacing, and termination are not edge cases to be handled defensively, but core resources for participation. This reframing opens new avenues for designing co-creative systems that support sustained engagement without relying on continuous output.

This is not merely better co-creation. It is co-creation with judgment. And that, we argue, marks a new category of algorithmic interaction—one in which artificial agents are no longer defined solely by their capacity to generate, but by their ability to participate well in dynamically changing co-creative environments.

8. Conclusion

This paper introduced a principled distinction between weak and strong co-creation, arguing that meaningful human–AI creative collaboration depends not on responsiveness or output quality alone, but on interactional regulation. We defined strong co-creation as an interactional regime in which artificial agents can commit to phrases, recognize completion through structural satisfaction, integrate contributions, maintain coupling across pauses, and yield space without surrendering agency. Together, these capacities allow artificial systems to participate in shared creative processes in a manner that is intelligible, rhythmically attuned, and responsive to the evolving interaction rather than driven solely by internal metrics or external constraint. By reframing co-creative AI around interactional regulation rather than output generation, this work shifts the central design and evaluation question from what systems produce to how they participate in shared, temporally extended sense-making. Strong co-creation is therefore not merely an incremental improvement over existing co-creative systems, but a qualitative shift in what it means for artificial agents to collaborate: it reframes silence, restraint, and timing as core dimensions of intelligent participation and marks a new category of algorithmic interaction.

Acknowledgements

The author acknowledges the use of an interactive artificial intelligence system as a collaborative thinking and writing partner during the development of this paper. The system contributed through dialogic exploration, conceptual clarification, and iterative refinement of arguments, while all theoretical framing, critical judgments, and final editorial decisions remained the responsibility of the author.

References

Boden, M. A. (2009). Computer models of creativity. AI Magazine, 30(3), 23–34. https://doi.org/10.1609/aimag.v30i3.2254

Clark, E., Ji, Y., & Smith, N. A. (2018). Neural text generation in stories using entity representations as context. In Proceedings of NAACL-HLT 2018 (pp. 225–230). Association for Computational Linguistics. https://doi.org/10.18653/v1/N18-2029

Clark, H. H. (1996). Using language. Cambridge University Press.

Clark, S., Montfort, N., et al. (2018). Interactive narrative and the problem of pacing. In Proceedings of the ACM Conference on Interactive Digital Storytelling (pp. 1–10). ACM.

Davis, N., Hsiao, C.-P., Singh, K., & Magerko, B. (2015). Drawing Apprentice: An interactive drawing assistant. In Proceedings of the ACM Creativity & Cognition Conference (pp. 185–194). ACM. https://doi.org/10.1145/2757226.2757240

Di Paolo, E. A., Buhrmann, T., & Barandiaran, X. E. (2017). Sensorimotor life: An enactive proposal. Oxford University Press. https://doi.org/10.1093/oso/9780199688348.001.0001

Fischer, G., Giaccardi, E., Eden, H., Sugimoto, M., & Ye, Y. (2021). Beyond binary choices: Integrating individual and social creativity. International Journal of Human–Computer Studies, 145, 102509. https://doi.org/10.1016/j.ijhcs.2020.102509

Fischer, J., Müller, J., & Guckelsberger, C. (2021). Mixed-initiative creative interfaces: A survey of human–AI collaboration in creative tasks. ACM Computing Surveys, 54(6), Article 129. https://doi.org/10.1145/3459990

Goffman, E. (1981). Forms of talk. University of Pennsylvania Press.

Jaques, N., Gu, S., Turner, R. E., & Eck, D. (2017). Tuning recurrent neural networks with reinforcement learning. In Proceedings of ICLR 2017. https://arxiv.org/abs/1611.02796

Kantosalo, A., & Toivonen, H. (2016). Modes of creative human–computer collaboration: Alternating and task-divided co-creativity. In Proceedings of the Seventh International Conference on Computational Creativity (pp. 77–84).

Karimi, P., Grace, K., & Maher, M. L. (2020). Creative sketching partner: An analysis of human–AI co-creative drawing. In Proceedings of the ACM Creativity & Cognition Conference (pp. 221–232). ACM. https://doi.org/10.1145/3387929.3398096

Levinson, S. C. (1983). Pragmatics. Cambridge University Press.

Levinson, S. C. (2016). Turn-taking in human communication. Frontiers in Psychology, 7, 138. https://doi.org/10.3389/fpsyg.2016.00138

Li, J., Monroe, W., Ritter, A., Jurafsky, D., Galley, M., & Gao, J. (2016). Deep reinforcement learning for dialogue generation. In Proceedings of EMNLP 2016 (pp. 1192–1202). Association for Computational Linguistics. https://doi.org/10.18653/v1/D16-1127

Oh, Y., Karimi, P., Grace, K., & Maher, M. L. (2018). I’m feeling lucky: Investigating the effects of suggestion generation on co-creative drawing. In Proceedings of the ACM Creativity & Cognition Conference (pp. 37–47). ACM. https://doi.org/10.1145/3197026.3197058

Sacks, H., Schegloff, E. A., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking for conversation. Language, 50(4), 696–735. https://doi.org/10.1353/lan.1974.0010

Sebanz, N., Bekkering, H., & Knoblich, G. (2006). Joint action: Bodies and minds moving together. Trends in Cognitive Sciences, 10(2), 70–76. https://doi.org/10.1016/j.tics.2006.01.008

Varela, F. J., Thompson, E., & Rosch, E. (1991). The embodied mind: Cognitive science and human experience. MIT Press.

Weinberg, G., Driscoll, S., & Harding, J. (2009). Musical interaction with Shimon: A social robotic musician. In Proceedings of the 4th ACM/IEEE International Conference on Human-Robot Interaction (pp. 233–234). ACM. https://doi.org/10.1145/1514095.1514156

Ammanabrolu, P., Tien, E., Cheung, W., Luo, Z., Ma, W., Martin, L., & Riedl, M. O. (2020). Story realisation: Expanding plot events into rich narratives. In Proceedings of the AAAI Conference on Artificial Intelligence.

Briot, J.-P., Hadjeres, G., & Pachet, F. (2020). Deep learning techniques for music generation. Springer.

Fan, A., Lewis, M., & Dauphin, Y. (2018). Hierarchical neural story generation. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (ACL).

Ha, D., & Eck, D. (2018). A neural representation of sketch drawings. In International Conference on Learning Representations (ICLR).

Kazi, R. H., Grossman, T., Fitzmaurice, G., & Singh, K. (2017). DreamSketch: Early stage ideation through sketching. In Proceedings of the ACM Conference on Creativity and Cognition.

Oh, Y., Gross, M. D., & Do, E. Y.-L. (2017). DuetDraw: Collaborative drawing with a generative agent. In Proceedings of the ACM Conference on Creativity and Cognition.

Pachet, F. (2003). The continuator: Musical interaction with style. Journal of New Music Research, 32(3), 333–341.

Yeh, Y.-T., Yang, L., Watson, M., Goodman, N. D., & Hartmann, B. (2019). Designing generative systems that support creativity. In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI).