DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2

Dynamic Autonomy Management in Human-AI Command and Control for Autonomous Weapons Systems

by Laszlo Pokorny
Department of Military Science
ICL Institute of Applied Sciences
New Jersey, USA

DOI: 10.5281/zenodo.19435927

2026

Copyright © 2026 by Laszlo Pokorny
All Rights Reserved

ABSTRACT

The accelerating deployment of autonomous weapons systems (AWS) in military operations has created an urgent need for empirically validated frameworks governing the dynamic allocation of decision authority between human commanders and artificial intelligence systems. This dissertation developed, tested, and validated a Dynamic Autonomy Management (DAM) framework for human-AI command and control (C2) in autonomous weapons employment through a four-phase sequential mixed-methods design. Phase 1 applied grounded theory analysis to an 84-document corpus of policy directives, government reports, and international legal instruments, identifying Autonomy Governance as the core category with the highest centrality score among eight emergent categories. Phase 2 employed agent-based computational modeling with 13,500 Monte Carlo iterations across three C2 architectures and three threat conditions, quantifying the fundamental speed–accountability tradeoff: human-in-the-loop (HITL) maintained 97.8% accountability chain integrity but with 8.51-second mean response latency, while human-over-the-loop (HOVL) achieved 1.20-second latency at the cost of reduced accountability (68.2%). Phase 3 conducted simulation-based experimentation with 118 participants in a 3 × 3 factorial design, confirming large effects of autonomy level on response time (η²p = .73) and a trust–accuracy paradox wherein higher-autonomy systems produced better objective performance but lower operator trust.
Phase 4 convened expert tabletop exercises with 18 defense professionals who rated the DAM framework positively across all five evaluation criteria, with Decision Traceability receiving the highest rating (M = 5.83, SD = 0.62) and Scalability the lowest (M = 4.72, SD = 1.18). Human-on-the-loop (HOTL) architecture consistently emerged across all four phases as the optimal default configuration, balancing operational tempo with meaningful human control. The DAM framework provides an 3 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 empirically grounded governance architecture for dynamic transitions of decision authority between human operators and autonomous weapons systems, directly informing Joint Chiefs of Staff doctrine development, DoDD 3000.09 implementation, and international autonomous weapons governance. Keywords: autonomous weapons systems, human-AI teaming, command and control, dynamic autonomy, meaningful human control, trust calibration, agent-based modeling 4 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Table of Contents ABSTRACT........................................................................................................................ 3 CHAPTER 1: INTRODUCTION ..................................................................................... 24 Introduction ................................................................................................................... 24 Background of the Problem .......................................................................................... 26 The Rise of Autonomous Weapons Systems ............................................................ 26 The Command and Control Challenge ..................................................................... 29 The Governance Landscape ...................................................................................... 31 The Human Dimension ............................................................................................. 
33 Statement of the Problem .............................................................................................. 35 Purpose of the Study ..................................................................................................... 37 Research Questions ....................................................................................................... 38 Research Question 1 ................................................................................................. 38 Research Question 2 ................................................................................................. 39 Research Question 3 ................................................................................................. 40 Table 1.1 Research Questions Mapped to Phases, Data Sources, and Methods ....... 41 Significance of the Study .............................................................................................. 42 Theoretical Significance ........................................................................................... 42 Practical Significance................................................................................................ 43 Policy Significance ................................................................................................... 44 5 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Societal Significance................................................................................................. 46 Theoretical Framework ................................................................................................. 47 Figure 1.1 Integrated Theoretical Framework for Dynamic Autonomy Management ............................................................................................................................................... 50 Research Design Overview ........................................................................................... 
51 Phase 1: Qualitative Grounded Theory Analysis ...................................................... 52 Phase 2: Agent-Based Computational Modeling ...................................................... 52 Phase 3: Simulation-Based Experimentation ............................................................ 53 Phase 4: Expert Tabletop Validation ........................................................................ 54 Figure 1.2 Four-Phase Sequential Mixed-Methods Research Design Overview ..... 55 Scope and Delimitations ............................................................................................... 56 Definition of Key Terms ............................................................................................... 57 Table 1.2 Definition of Key Terms ........................................................................... 58 Assumptions.................................................................................................................. 59 Organization of the Dissertation ................................................................................... 61 Table 2.3 Organization of the Dissertation ............................................................... 62 Chapter Summary ......................................................................................................... 64 CHAPTER 2: LITERATURE REVIEW .......................................................................... 66 Introduction to the Literature Review ........................................................................... 66 Theoretical Foundations.................................................................................................... 68 6 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Systems Theory and Sociotechnical Systems ............................................................... 69 Human Factors and Cognitive Engineering .................................................................. 
71 Naturalistic Decision-Making ....................................................................................... 73 Trust Theory.................................................................................................................. 75 Levels of Automation Framework ................................................................................ 77 Autonomous Weapons Systems: Development and Classification .................................. 80 Historical Evolution of Autonomous Weapons ............................................................ 80 Taxonomy and Classification of Autonomous Systems ............................................... 83 Current Autonomous and Semi-Autonomous Weapons Programs .............................. 85 Military Robotics and Unmanned Systems................................................................... 87 Counter-Autonomy and Adversarial Considerations .................................................... 89 Human-AI Teaming and Collaboration ............................................................................ 90 Foundations of Human-AI Teaming ............................................................................. 91 Trust in AI and Autonomous Systems .......................................................................... 93 Trust in Military AI Contexts ....................................................................................... 95 Human-Robot Interaction in Military Operations......................................................... 97 Team Performance and Effectiveness with AI Agents ................................................. 98 Dynamic Autonomy and Adaptive Control .................................................................... 100 Concepts of Dynamic and Sliding Autonomy ............................................................ 100 Mixed-Initiative Systems and Playbook Approaches ................................................. 
102 7 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Function Allocation Methods ..................................................................................... 103 Ironies of Automation and Out-of-the-Loop Problems .............................................. 104 Context-Dependent Autonomy Allocation in Military Systems ................................. 106 Command and Control in the Age of AI ......................................................................... 108 Classical C2 Theory .................................................................................................... 108 Network-Centric Warfare and C2 Agility................................................................... 109 Mission Command Philosophy and AI Integration .................................................... 111 Joint All-Domain Command and Control (JADC2) ................................................... 112 AI Decision Support in Military C2............................................................................ 114 Meaningful Human Control and Governance ................................................................. 115 The Concept of Meaningful Human Control .............................................................. 115 DoD Directive 3000.09 and U.S. Policy Framework ................................................. 117 International Governance Efforts ................................................................................ 119 Accountability and Responsibility Frameworks ......................................................... 120 Technical Implementation of Human Control Mechanisms ....................................... 121 Legal and Ethical Frameworks ....................................................................................... 123 International Humanitarian Law Applied to Autonomous Weapons Systems ........... 123 Just War Theory and Autonomous Weapons.............................................................. 
125 Ethical Perspectives on Autonomous Weapons Systems ........................................... 126 Human Dignity and the Ethics of Algorithmic Killing............................................... 128 8 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 The Campaign to Stop Killer Robots and Civil Society Perspectives ........................ 129 Explainable AI and Transparency ................................................................................... 130 DARPA XAI Program and Military Applications ...................................................... 131 Explanation Types for Military Operators .................................................................. 132 XAI Evaluation Methods and Standards..................................................................... 133 Transparency Requirements for Autonomous Weapons Decision-Making ............... 134 Computational Modeling and Simulation ....................................................................... 135 Agent-Based Modeling in Defense ............................................................................. 136 Simulation-Based Analysis of Command and Control ............................................... 137 Wargaming and Computational Models ..................................................................... 138 Validation Challenges for Military Agent-Based Models .......................................... 139 Synthesis and Identification of Research Gaps............................................................... 140 Summary of Key Findings Across the Literature ....................................................... 140 Critical Research Gaps................................................................................................ 142 Conceptual Framework for the Present Study ............................................................ 144 Research Questions Revisited ..................................................................................... 
145 CHAPTER 3: METHODOLOGY .................................................................................. 148 Introduction ................................................................................................................. 148 Research Design and Rationale ...................................................................................... 149 Overview of the Mixed-Methods Sequential Design.................................................. 149 9 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Rationale for the Mixed-Methods Approach .............................................................. 150 Integration Across Phases ........................................................................................... 152 Figure 3.1 Four-Phase Sequential Mixed-Methods Research Design .................... 153 Research Questions and Hypotheses .............................................................................. 153 Research Question 1 ................................................................................................... 153 Research Question 2 ................................................................................................... 154 Research Question 3 ................................................................................................... 155 Table 3.1 Research Questions, Data Sources, Variables, and Analytic Methods ... 156 Population, Sampling, and Units of Analysis ................................................................. 157 Document Corpus and Artifact Populations ............................................................... 157 Weapons Systems and Technical Data Populations ................................................... 158 Sampling Logic ........................................................................................................... 158 Units of Analysis......................................................................................................... 
159 Data Sources and Data Collection Procedures................................................................ 160 Congressional Testimony and Legislative Records .................................................... 160 Government Accountability Office Reports ............................................................... 160 Congressional Research Service Reports.................................................................... 161 SIPRI Autonomous Weapons Data............................................................................. 161 DoD Directive 3000.09 Parameters ............................................................................ 162 Weapons Performance Data ........................................................................................ 162 10 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Think-Tank Publications and Case Study Data .......................................................... 162 DARPA Assured Autonomy Program Data ............................................................... 163 Table 3.2 Dataset Inventory and Intended Use ....................................................... 163 Variables, Constructs, and Operational Definitions ....................................................... 164 Dynamic Autonomy .................................................................................................... 164 Meaningful Human Control ........................................................................................ 165 Trust Calibration ......................................................................................................... 165 Decision Quality ......................................................................................................... 166 Response Latency ....................................................................................................... 166 Accountability Chain Integrity ................................................................................... 
166 Additional Constructs ................................................................................................. 167 Table 3.3 Constructs, Operational Definitions, Indicators, and Measurement Strategy ............................................................................................................................................. 168 Instrumentation and Measures ........................................................................................ 169 Qualitative Coding Framework................................................................................... 169 Trust and Cognitive Load Measurement..................................................................... 170 Scenario Evaluation Rubrics ....................................................................................... 170 Autonomy Condition Manipulations .......................................................................... 171 Phase 1: Qualitative Grounded Theory Procedures ........................................................ 171 Document Corpus Development ................................................................................. 171 11 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Coding Procedures ...................................................................................................... 172 Memoing and Constant Comparison .......................................................................... 172 Trustworthiness Procedures ........................................................................................ 173 Phase 1 Outputs and Phase 2 Connection ................................................................... 173 Phase 2: Agent-Based Computational Modeling Procedures ......................................... 174 Model Design and Parameterization ........................................................................... 
174 Agent Specifications ................................................................................................... 174 C2 Architecture Implementation................................................................................. 175 Scenario Matrix........................................................................................................... 175 Model Calibration, Sensitivity Analysis, and Validation ........................................... 176 Table 3.4 Agent-Based Model Entities, Rules, and Outputs .................................. 177 Phase 3: Simulation-Based Experimental Procedures .................................................... 177 Scenario Design .......................................................................................................... 177 Experimental Design and Conditions ......................................................................... 178 Participant Assignment and Randomization ............................................................... 178 Pilot Testing ................................................................................................................ 179 Data Capture and Logging .......................................................................................... 179 Table 3.5 Experimental Conditions and Outcome Measures.................................. 180 Phase 4: Tabletop Exercise Validation Procedures ........................................................ 180 Tabletop Exercise Design ........................................................................................... 180 12 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Participant Roles and Recruitment.............................................................................. 181 Injects and Facilitation ................................................................................................ 181 Evaluation Forms and Validation Criteria .................................................................. 
182 Table 3.6 Tabletop Exercise Validation Matrix ...................................................... 183 Data Analysis Plan .......................................................................................................... 183 Qualitative Analysis: Phase 1 and Phase 4 ................................................................. 183 Quantitative Analysis: Phases 2 and 3 ........................................................................ 184 Mixed-Methods Integration ........................................................................................ 185 Reliability, Validity, Trustworthiness, and Rigor ........................................................... 186 Qualitative Trustworthiness ........................................................................................ 186 Quantitative Validity................................................................................................... 186 Model Validity for Agent-Based Modeling ................................................................ 187 Scenario Validity for Simulations and Tabletop Exercises ........................................ 188 Ethical Considerations .................................................................................................... 188 Use of Public and Unclassified Data .......................................................................... 188 Data Security and Confidentiality............................................................................... 189 Dual-Use Concerns and Responsible Research .......................................................... 189 Limitations and Delimitations of the Methodology ........................................................ 189 Methodological Limitations ........................................................................................ 189 Delimitations ............................................................................................................... 
190 13 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Chapter Summary ........................................................................................................... 191 CHAPTER 4: RESULTS ................................................................................................ 193 Introduction ................................................................................................................. 193 Phase 1: Qualitative Grounded Theory Results .......................................................... 195 Document Corpus Description ................................................................................ 195 Open Coding Results .............................................................................................. 196 Table 4.1 Theme Frequency Distribution Across Document Corpus (N = 84) ...... 197 Figure 4.1 Theme Frequency Distribution Across the 84-Document Corpus, ColorCoded by Category ............................................................................................................. 198 Figure 4.2 Theme Distribution by Document Source Category ............................. 199 Axial Coding and Theme Relationships ................................................................. 201 Table 4.2 Top 15 Axial Coding Relationships by Co-occurrence Frequency ........ 201 Figure 4.3 Code Co-occurrence Heatmap Across the 19 Thematic Codes............. 203 Figure 4.4 Heatmap of Top Cross-Category Axial Coding Relationships by Jaccard Similarity............................................................................................................................. 204 Selective Coding: Core Categories ......................................................................... 204 Table 4.3 Category Centrality Scores from Selective Coding Analysis ................. 205 Table 4.4 Hierarchical Theme Taxonomy: Categories and Constituent Themes ... 
206 Figure 4.5 Hierarchical Theme Taxonomy Visualization Showing Category Structure and Theme Frequencies....................................................................................... 207 14 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Emergent Theoretical Framework .......................................................................... 207 Key Qualitative Findings ........................................................................................ 208 Phase 2: Agent-Based Modeling Results .................................................................... 209 Model Calibration and Validation .......................................................................... 209 Architecture Comparison Results ........................................................................... 210 Table 4.5 Agent-Based Model Performance Metrics by C2 Architecture (N = 4,500 per Architecture) ................................................................................................................. 210 Figure 4.6 Performance Metrics Comparison Across Three C2 Architectures ...... 213 Table 4.6 Architecture × Threat Condition Interaction: Mean Performance Metrics ............................................................................................................................................. 213 Figure 4.7 Response Latency Distributions by C2 Architecture and Threat Condition ............................................................................................................................................. 214 Figure 4.8 Architecture × Threat Condition Interaction for All Performance Metrics ............................................................................................................................................. 214 Monte Carlo Simulation Outcomes ........................................................................ 214 Figure 4.9 Mission Success Rate Across Threat Conditions by C2 Architecture .. 
215 Sensitivity Analysis ................................................................................................ 215 Table 4.7 Sensitivity Analysis Results: Parameter Effects on Mission Success Rate ............................................................................................................................................. 216 Figure 4.10 Tornado Diagram Showing Parameter Sensitivity on Mission Success Rate ..................................................................................................................................... 217 15 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Key ABM Findings ................................................................................................. 217 Phase 3: Simulation-Based Experimental Results ...................................................... 218 Participant and Data Overview ............................................................................... 218 Descriptive Statistics............................................................................................... 219 Table 4.8 Overall Descriptive Statistics for Dependent Variables (N = 118) ........ 219 Table 4.9 Descriptive Statistics by Experimental Condition (Autonomy Level × Threat Tempo) .................................................................................................................... 219 Figure 4.11 Grouped Bar Charts With Error Bars for All Dependent Variables by Autonomy Level and Threat Tempo ................................................................................... 220 MANOVA Results .................................................................................................. 220 Univariate ANOVA Results for Each Dependent Variable .................................... 
221 Table 4.10 Summary of Two-Way ANOVA Results for All Dependent Variables221 Figure 4.12 Interaction Plots for All Five Dependent Variables (Autonomy Level × Threat Tempo) .................................................................................................................... 223 Effect Size Summary .............................................................................................. 226 Figure 4.13 Partial Eta-Squared Effect Sizes for All Factor-DV Combinations .... 227 Post-Hoc Comparisons............................................................................................ 227 Table 4.11 Tukey HSD Post-Hoc Pairwise Comparisons for All Dependent Variables ............................................................................................................................. 227 Figure 4.14 Distribution Histograms for All Dependent Variables by Autonomy Level ................................................................................................................................... 229 16 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Key Experimental Findings .................................................................................... 229 Phase 4: Tabletop Exercise Validation Results .......................................................... 230 Expert Panel Description ........................................................................................ 230 Quantitative Validation Ratings.............................................................................. 230 Table 4.12 Expert Validation Ratings: Descriptive Statistics (N = 18) .................. 231 Figure 4.15 Radar Chart of Mean Expert Ratings Across Five Validation Criteria 232 Table 4.13 One-Sample t-Tests Against Neutral Midpoint (Test Value = 4.0) ...... 
232 Figure 4.16 Box Plots of Expert Ratings by Validation Criterion With Individual Data Points .......................................................................................................................... 233 Inter-Rater Reliability ............................................................................................. 233 Table 4.14 Intraclass Correlation Coefficient Results ............................................ 233 Qualitative Feedback Themes ................................................................................. 234 Table 4.15 Expert Qualitative Feedback Themes With Representative Quotes (N = 18) ....................................................................................................................................... 234 Figure 4.17 Mean Expert Ratings by Professional Background Category ............. 236 Key Validation Findings ......................................................................................... 236 Cross-Phase Integration and Convergence ................................................................. 237 Convergence of Findings Across Phases ................................................................ 237 Addressing Research Questions.............................................................................. 238 Table 4.16 Research Questions Mapped to Cross-Phase Findings ......................... 238 17 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 The Dynamic Autonomy Management Framework ............................................... 240 Unexpected Findings and Emergent Insights ......................................................... 241 Chapter Summary ....................................................................................................... 242 CHAPTER 5: DISCUSSION.......................................................................................... 
245
Introduction ..... 245
Interpretation of Findings ..... 247
Research Question 1: Dynamic Autonomy Frameworks ..... 248
Qualitative Foundations of Dynamic Autonomy ..... 249
Agent-Based Modeling and the Speed-Accountability Tradeoff ..... 251
Dynamic Autonomy and Existing Governance Frameworks ..... 252
Research Question 2: Human-AI Trust and Decision Quality ..... 255
Trust Calibration Across Autonomy Levels ..... 255
The Trust-Accuracy Paradox Under High-Tempo Conditions ..... 259
Cognitive Load and the Dynamics of Authority Transfer ..... 261
Implications for Transfer-of-Control Protocol Design ..... 262
Research Question 3: Operational Validation and Implementation ..... 263
Expert Validation and Operational Feasibility ..... 263
Scalability: The Primary Implementation Challenge ..... 265
Connecting to Boyd’s OODA Loop and JADC2 ..... 266
Expert Qualitative Feedback in Context ..... 267
The Dynamic Autonomy Management (DAM) Framework ..... 268
Framework Overview and Architecture .....
269
Figure 5.1 The Dynamic Autonomy Management (DAM) Framework Architecture ..... 271
Theoretical Contributions ..... 272
Comparison with Existing Frameworks ..... 274
Table 5.2 Comparison of DAM Framework with Existing Autonomy Frameworks ..... 275
Implications ..... 277
Implications for Theory ..... 277
Contributions to Human-AI Teaming Theory ..... 277
Contributions to Trust Calibration in Autonomous Systems ..... 278
Contributions to C2 Theory in the Age of AI ..... 280
Contributions to Meaningful Human Control Discourse ..... 281
Implications for Military Practice and Operations ..... 281
C2 Architecture Design ..... 282
Training Implications ..... 282
Doctrine Development ..... 283
Force Design and Operational Planning .....
284
Implications for Policy and Senior Leadership ..... 285
Recommendations for the Joint Chiefs of Staff ..... 286
Implications for DoDD 3000.09 Revision ..... 286
Defense Industrial Base Recommendations ..... 287
International Governance Implications ..... 288
Table 5.3 Policy Recommendations Summary ..... 290
Limitations ..... 291
Methodological Limitations ..... 291
Simulated Versus Real Experimental Conditions ..... 291
Use of Publicly Available and Unclassified Data Only ..... 292
Agent-Based Model Simplifications ..... 293
Generalizability Constraints ..... 294
Scope Limitations ..... 295
Focus on U.S. Military Context ..... 295
Temporal Limitations ..... 295
Mitigation Strategies Employed ..... 295
Recommendations for Future Research .....
296
Extending the DAM Framework ..... 297
Live Field Testing with Military Operators ..... 297
Longitudinal Trust Evolution Studies ..... 298
Cross-Cultural and Coalition Validation ..... 298
Emerging Research Directions ..... 298
LLM-Mediated C2 and Generative AI in Autonomous Weapons ..... 298
Swarm Autonomy Management ..... 299
Adversarial AI and Counter-Autonomy ..... 299
Neurobiological Trust Metrics ..... 300
Methodological Advances ..... 300
Digital Twin Approaches for C2 Testing ..... 300
Real-Time Trust Measurement in Operational Settings ..... 300
Table 5.4 Future Research Agenda ..... 301
Table 5.5 Summary of Findings by Research Question ..... 302
Conclusions ..... 303
CHAPTER 6: CONCLUSION ..... 308
Overview of the Study .....
308
Summary of Key Findings and Contributions ..... 311
The Speed-Accountability Tradeoff ..... 311
The Trust-Accuracy Paradox ..... 313
The Dynamic Autonomy Management Framework ..... 315
Table 6.1 Dynamic Autonomy Management Framework Components Summary ..... 318
Contributions to Knowledge ..... 319
Theoretical Contributions ..... 319
Methodological Contributions ..... 320
Practical Contributions ..... 320
Table 6.2 Summary of Dissertation Contributions ..... 321
Strategic Recommendations ..... 322
Recommendations for Force Design and C2 Architecture ..... 322
Table 6.3 Strategic Recommendations for Force Design and C2 Architecture ..... 324
Recommendations for Doctrine and Training ..... 325
Recommendations for Acquisition and the Defense Industrial Base ..... 328
Recommendations for Policy and International Engagement ..... 331
A Roadmap for Future Research ..... 333
Near-Term Research Priorities (1–3 Years) .....
334
Medium-Term Research Priorities (3–5 Years) ..... 335
Long-Term Research Horizons (5–10 Years) ..... 336
Table 6.4 Research Roadmap for Dynamic Autonomy Management ..... 338
Reflections on the Research Journey ..... 339
Final Statement ..... 342
Figure 6.1 Dynamic Autonomy Management Framework Architecture ..... 345
REFERENCES ..... 347
APPENDIX A: DATA SOURCES AND CODING ..... 371
A.1 Overview ..... 371
A.2 Document Corpus Inventory ..... 372
A.3 Qualitative Coding Framework ..... 383
A.4 Data Quality and Trustworthiness ..... 394
A.5 Data Access and Replication Guide ..... 397
APPENDIX B: QUANTITATIVE ANALYSIS SUMMARY ..... 401
B.1 Overview ..... 401
B.2 Phase 2: Agent-Based Modeling — Complete Results ..... 402
B.3 Phase 3: Simulation-Based Experiment — Complete Statistical Output .....
414
B.4 Phase 4: Tabletop Exercise Validation — Complete Results ..... 434
B.5 Cross-Phase Statistical Integration ..... 443

CHAPTER 1: INTRODUCTION

Introduction

On March 15, 2023, the Deputy Secretary of Defense issued updated guidance on autonomy in weapon systems that fundamentally altered the governance landscape for autonomous military technologies (U.S. Department of Defense, 2023). Department of Defense Directive 3000.09, originally signed in 2012 and revised after more than a decade of technological acceleration, established that autonomous and semi-autonomous weapon systems shall be designed to allow commanders and operators to exercise appropriate levels of human judgment over the use of force. Yet the directive left unanswered the most consequential operational question facing military commanders in the age of artificial intelligence: precisely how, when, and under what conditions should decision authority transition between human operators and autonomous weapons systems during the dynamic, high-tempo operations that characterize modern warfare?

This question is not abstract. As of 2026, the United States, China, Russia, Israel, Turkey, and at least a dozen other nations are developing or fielding autonomous weapons systems with increasing levels of decision-making capability (Stockholm International Peace Research Institute [SIPRI], 2022; Congressional Research Service [CRS], 2024). The Joint All-Domain Command and Control (JADC2) vision calls for AI-enabled decision-making at machine speed across air, land, sea, space, and cyberspace domains (Special Competitive Studies Project, 2024). Meanwhile, adversary investments in autonomous military capabilities are accelerating at a pace that challenges the assumption of American technological superiority (Scharre, 2023).
The strategic environment demands systems that can operate with sufficient speed to maintain competitive advantage, while the legal, ethical, and democratic foundations of American military power demand that human judgment remain meaningfully engaged in decisions involving lethal force.

The strategic implications of this technological transformation cannot be overstated. The 2022 National Defense Strategy identified integrated deterrence and the ability to prevail in conflict as central priorities, both of which depend on the effective integration of autonomous capabilities into military operations. The People’s Republic of China has declared its intention to become the world leader in artificial intelligence by 2030, with explicit military applications forming a core component of that strategy. Russia’s investments in autonomous weapons, from the Poseidon autonomous torpedo to the Uran-9 unmanned combat ground vehicle, reflect a determination to leverage autonomous capabilities for strategic advantage. In this competitive environment, the nation that develops the most effective frameworks for human-AI command and control—frameworks that maximize autonomous capability while maintaining the accountability and oversight that legitimate military force requires—will possess a decisive strategic advantage.

This central tension—between the imperative of speed and the imperative of human control—defines the research problem that this dissertation addresses. Despite the proliferation of autonomous weapons programs, the growing sophistication of AI-enabled command and control architectures, and the intensifying international debate over autonomous weapons governance, no empirically validated framework exists for managing the dynamic transition of decision authority between human operators and autonomous systems in military command and control environments (Pokorny, 2026).
This absence is not merely a gap in the scholarly literature; it represents a critical vulnerability in the national security infrastructure of the United States and its allies.

This dissertation was designed to close that gap. Through a four-phase sequential mixed-methods investigation integrating qualitative grounded theory, agent-based computational modeling, simulation-based experimentation, and expert tabletop validation, this research developed, tested, and validated a Dynamic Autonomy Management (DAM) framework that provides empirically grounded principles and protocols for governing the allocation of decision authority between human commanders and autonomous weapons AI across the spectrum of military operations. The findings directly inform Joint Chiefs of Staff doctrine development, DoDD 3000.09 implementation guidance, and the broader international discourse on autonomous weapons governance.

Background of the Problem

The Rise of Autonomous Weapons Systems

The development of autonomous weapons systems represents a trajectory stretching back decades, from early automated defense systems such as the Phalanx Close-In Weapons System, deployed by the U.S. Navy in the 1980s, to the increasingly sophisticated autonomous platforms under development today (Scharre, 2018). The Phalanx and its contemporaries—including Israel’s Iron Dome and the U.S. Patriot missile system—operated under relatively constrained autonomy: their engagement envelopes were narrow, their targeting parameters were predefined, and the environments in which they operated were sufficiently structured that automated decision-making could proceed without the ambiguity that characterizes most military operations (Williams & Scharre, 2015).
These systems established the precedent that machines could make engagement decisions faster than humans in certain well-defined scenarios, but they did not fundamentally challenge the principle of human control over the use of force.

The transformation accelerated dramatically with the maturation of machine learning, computer vision, natural language processing, and multi-agent coordination algorithms. Modern autonomous weapons systems are distinguished from their predecessors by their capacity to operate in unstructured environments, adapt to novel situations, and make targeting decisions based on learned representations rather than pre-programmed rules (U.S. Air Force, 2015). The Congressional Research Service (2024) documented that over 30 nations are now developing autonomous military systems of varying capability, with investments accelerating particularly among the United States, China, Russia, and regional powers including Turkey, Israel, South Korea, and the United Kingdom. The Stockholm International Peace Research Institute has tracked a marked increase in the operational deployment of autonomous and semi-autonomous systems, particularly in the domains of aerial surveillance, maritime patrol, and ground-based logistics (SIPRI, 2022).

The conflict in Ukraine, beginning in 2022 and continuing through the present, has served as a proving ground for autonomous military technologies at a scale and intensity not previously observed (van der Velde et al., 2021). Both sides have employed autonomous drones for reconnaissance, targeting, and strike operations, providing real-world data on the operational advantages—and the governance challenges—of deploying autonomous systems in contested environments.
These developments have accelerated what many defense analysts describe as a global autonomous weapons arms race, in which the pace of technological development threatens to outstrip the capacity of governance frameworks to maintain meaningful human oversight (Horowitz, 2019).

The taxonomic classification of autonomous weapons systems reflects the complexity of the governance challenge. The Department of Defense distinguishes between autonomous weapon systems, which can select and engage targets without further intervention by a human operator, and semi-autonomous weapon systems, which employ autonomy for engagement-related functions but require human operator action to complete engagement (U.S. Department of Defense, 2023). This binary classification, while administratively useful, obscures the reality that autonomy exists on a spectrum and that the degree of human involvement may vary dynamically within a single operational mission. A surveillance drone that operates autonomously during patrol may require human authorization for target engagement, human monitoring during tracking, and fully autonomous operation during evasive maneuvers—all within the same sortie. The governance frameworks needed for such systems must be dynamic rather than static, adapting to the changing demands of the operational environment in real time.

The global proliferation of autonomous weapons technology has introduced additional urgency to the governance challenge. Nations with varying levels of democratic accountability, international legal compliance, and ethical commitment are developing autonomous weapons capabilities, creating a diverse and potentially destabilizing landscape of autonomous military systems with inconsistent governance standards (Horowitz, 2019).
China’s military AI strategy explicitly prioritizes autonomous systems for future warfare, while Russia has invested significantly in autonomous ground vehicles, underwater systems, and drone swarms (Congressional Research Service, 2024). Regional powers including Turkey and Iran have demonstrated operational deployment of autonomous attack drones, establishing precedents for the use of autonomous weapons outside the frameworks of established military powers. This proliferation environment means that the governance frameworks developed by the United States will not operate in isolation but must be robust enough to maintain accountability and effectiveness in contested environments where adversaries may employ autonomous systems with fewer constraints.

The acceleration of AI capabilities—particularly the emergence of large language models, foundation models, and multi-modal AI systems—has further expanded the potential scope of autonomous weapons. Systems that once operated within narrow task domains can now integrate information from multiple sensor modalities, reason about complex tactical situations, and generate courses of action that account for constraints ranging from rules of engagement to collateral damage estimation (Strouse et al., 2024). These capabilities raise the prospect of autonomous systems that are not merely automated—executing pre-programmed responses—but genuinely autonomous in the sense of exercising judgment-like functions in novel, ambiguous, and high-consequence situations.

The Command and Control Challenge

The integration of autonomous weapons into military command and control architectures presents challenges that go well beyond the design of individual weapon systems. Command and control—the exercise of authority and direction by a properly designated commander over assigned and attached forces in the accomplishment of a mission (U.S.
Army, 2019)—is fundamentally a human process, rooted in the commander’s authority, judgment, and accountability. The introduction of autonomous systems that can perceive, decide, and act without direct human authorization disrupts the foundational logic of military C2 in ways that existing doctrine has not yet fully addressed (Alberts, 2011).

The Joint All-Domain Command and Control (JADC2) concept envisions a networked, AI-enabled C2 architecture capable of integrating sensors, shooters, and decision-makers across all warfighting domains at machine speed (Special Competitive Studies Project, 2024). JADC2 represents the Department of Defense’s recognition that future conflicts will be decided by the speed and quality of decision-making, and that AI must play a central role in achieving decision advantage. However, the JADC2 vision raises profound questions about the nature of human authority in a system where AI agents may perceive threats, recommend responses, and execute actions faster than human operators can comprehend the situation. The tension between centralized command authority and the distributed, high-speed execution enabled by autonomous systems has become the central design challenge for next-generation C2 architectures (Alberts, 2011).

Three paradigms have emerged for structuring the human-AI relationship in command and control for autonomous weapons. Human-in-the-loop (HITL) systems require human authorization for critical decisions, particularly engagement decisions involving the use of lethal force. Human-on-the-loop (HOTL) systems allow the AI to initiate actions autonomously while maintaining human supervisory oversight and the ability to intervene or abort. Human-over-the-loop (HOVL) systems operate within human-defined parameters and governance constraints, with the human role limited to setting strategic objectives and rules rather than supervising individual actions (Nadibaidze et al., 2025).
Each paradigm presents distinct tradeoffs between operational speed, accountability, and the degree of meaningful human control—tradeoffs that, prior to this research, had not been empirically quantified or comparatively assessed under realistic operational conditions.

The operational implications of these architectural choices extend beyond abstract governance questions to the tactical level of engagement. In a contested anti-access/area-denial (A2/AD) environment, an autonomous weapons system may need to prosecute a time-critical target within seconds of detection—a timeline that may not permit human-in-the-loop authorization without sacrificing mission success. Conversely, a strike against a target in a civilian-populated area demands the highest levels of human judgment regarding proportionality, distinction, and necessity—judgment that cannot be delegated to an autonomous system regardless of its technical capability. The challenge for military C2 is not to select a single point on the autonomy spectrum but to design systems capable of moving dynamically along that spectrum as operational conditions demand, while maintaining governance integrity at every point.

The concept of multi-domain operations further complicates the C2 challenge. Future conflicts are expected to involve the simultaneous employment of autonomous systems across air, land, sea, space, and cyber domains, requiring C2 architectures that can coordinate dozens or hundreds of autonomous platforms under unified human authority (Special Competitive Studies Project, 2024). The cognitive demands of supervising a single autonomous system are significant; the demands of supervising a swarm of autonomous systems operating across multiple domains are without precedent in military history. This scalability challenge was identified consistently across all four phases of the present research as the primary limitation of current approaches to human-AI C2.
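The three paradigms and the idea of moving dynamically along the autonomy spectrum can be illustrated with a minimal sketch. Everything here is hypothetical for illustration: the enum, the two inputs, and the timing thresholds are not the DAM framework's actual selection logic.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """The three C2 paradigms, ordered by increasing machine authority."""
    HITL = 1  # human-in-the-loop: a human authorizes each engagement
    HOTL = 2  # human-on-the-loop: AI acts; a human supervises and may abort
    HOVL = 3  # human-over-the-loop: humans set parameters; AI executes

def select_autonomy(time_to_impact_s: float, civilians_present: bool) -> AutonomyLevel:
    """Hypothetical dynamic-selection rule (thresholds are illustrative only).

    Engagements near civilians always require direct human authorization;
    otherwise, tighter timelines shift authority toward the machine.
    """
    if civilians_present:
        return AutonomyLevel.HITL   # proportionality/distinction judgment required
    if time_to_impact_s < 2.0:
        return AutonomyLevel.HOVL   # faster than human reaction permits
    if time_to_impact_s < 10.0:
        return AutonomyLevel.HOTL   # supervised autonomous response
    return AutonomyLevel.HITL       # time allows full human authorization
```

Under this sketch, a time-critical A2/AD engagement with one second to impact would run at HOVL, while the same system defaults back to HITL whenever timelines relax or civilians are detected — the dynamic movement along the spectrum, rather than any particular threshold, is the point.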
The Governance Landscape

The governance of autonomous weapons systems operates at the intersection of national policy, international law, and military doctrine. At the national level, the United States has pursued a policy-driven approach anchored in Department of Defense Directive 3000.09, which establishes that autonomous and semi-autonomous weapon systems shall be designed to allow appropriate levels of human judgment over the use of force (U.S. Department of Defense, 2023). The directive requires senior-level review and approval for the development and fielding of autonomous weapons, mandates that systems incorporate safeguards against unintended engagements, and calls for training programs that ensure operators and commanders understand the capabilities and limitations of autonomous systems. However, the directive provides high-level policy guidance rather than operational protocols; it does not specify how dynamic transitions between autonomy levels should be managed during active operations, nor does it prescribe validated mechanisms for ensuring accountability chain integrity when decision authority shifts between human and machine.

At the international level, discussions within the United Nations Convention on Certain Conventional Weapons (CCW) framework have sought to establish norms for lethal autonomous weapons systems (LAWS) since 2014, but progress has been slow and politically contested (United Nations Institute for Disarmament Research [UNIDIR], 2025). While there is broad agreement on the principle that human responsibility must be maintained in decisions involving the use of force, the operationalization of this principle—what constitutes “meaningful human control” and how it should be verified—remains deeply contested (Santoni de Sio & van den Hoven, 2018).
The absence of international consensus has created a governance vacuum in which states are developing and deploying autonomous weapons under widely varying standards of human oversight, raising concerns about accountability, escalation dynamics, and compliance with international humanitarian law (Sharkey, 2012).

The requirement for legal review of new weapons under Article 36 of Additional Protocol I to the Geneva Conventions adds another dimension to the governance challenge. States that are party to the protocol are obligated to determine whether the employment of any new weapon would be prohibited by international law (UNIDIR, 2025). For autonomous weapons, this review must assess not only the weapon’s physical characteristics but also its decision-making algorithms, the conditions under which it exercises autonomous judgment, and the mechanisms by which human control is maintained or can be reasserted. The novelty and complexity of these considerations strain existing Article 36 review processes, which were designed for weapons with fixed, predictable capabilities rather than adaptive, learning systems whose behavior may vary across contexts (Crootof, 2015).

The gap between policy aspirations and operational reality is significant. DoDD 3000.09 articulates the principle of appropriate human judgment, but commanders in the field lack validated frameworks for implementing that principle in the context of dynamic, high-tempo operations involving autonomous weapons (Filippi et al., 2026). International discussions emphasize meaningful human control, but no empirical research has defined what “meaningful” means in operational terms—how much control is enough, at what decision points human judgment is essential, and what mechanisms can ensure that control remains effective as the speed and complexity of operations increase. This governance gap provided a central motivation for the present research.
The Human Dimension

The challenge of governing autonomous weapons is not solely a matter of policy and technology; it is fundamentally a human factors problem. The effectiveness of any C2 architecture depends on the capacity of human operators to understand, trust, supervise, and when necessary override autonomous systems operating at speeds that may exceed human cognitive processing capabilities (Parasuraman et al., 2000). Research on trust in automation has consistently demonstrated that the human-machine relationship is mediated by complex psychological processes that do not always align with objective system performance (Lee & See, 2004; Hoff & Bashir, 2015).

In military contexts, trust dynamics are further complicated by organizational culture, hierarchical command relationships, combat stress, and the life-or-death consequences of decisions (Johnson, 2025). Military operators have been shown to engage AI systems as teammates rather than tools, activating social cognition pathways typically associated with human collaboration (de Visser et al., 2020). This finding suggests that trust calibration in military human-AI teaming cannot be reduced to simple performance metrics but must account for the relational, affective, and organizational dimensions of the human-machine partnership.

Cognitive demands on operators overseeing autonomous systems represent another critical dimension. As autonomy increases, the operator’s role shifts from direct controller to supervisor and exception handler—a transition that paradoxically may increase cognitive demands even as it reduces physical workload (Endsley, 2017). The operator must maintain sufficient situational awareness to intervene effectively when the autonomous system encounters situations beyond its competence, while simultaneously monitoring multiple systems across potentially disparate operational contexts.
This supervisory control challenge is compounded by the well-documented phenomenon of automation complacency, in which operators become less vigilant as their trust in the autonomous system grows (Wickens & Dixon, 2007). The phenomenon of automation bias—the tendency of human operators to favor automated recommendations even when those recommendations are incorrect—poses particular risks in autonomous weapons contexts (Wickens & Dixon, 2007). Military operators working under time pressure and information overload may be especially susceptible to automation bias, accepting AI targeting recommendations without the critical evaluation that meaningful human control requires. Conversely, automation-induced distrust—the rejection of AI recommendations based on a single prior failure—can lead operators to override autonomous systems unnecessarily, degrading the operational tempo advantages that autonomy provides. Both phenomena reflect failures of trust calibration, and both have potentially lethal consequences in weapons employment scenarios. The design of C2 architectures must account for these human cognitive tendencies, building in safeguards against both over-reliance and under-reliance on autonomous capabilities. Perhaps most fundamentally, the employment of autonomous weapons raises the accountability question: when an autonomous system makes an engagement decision that results in unintended harm, who bears responsibility? The commander who authorized the system’s deployment? The operator who was supervising its operation? The engineer who designed its decision algorithms? The “responsibility gap”—the possibility that no human is meaningfully responsible for an autonomous system’s lethal actions—has been identified as one of the most significant ethical and legal challenges posed by autonomous weapons (Sparrow, 2007; Human Rights Watch, 2012).
This accountability challenge is directly linked to the choice of C2 architecture: the degree and nature of human involvement in autonomous weapons decisions determines the extent to which meaningful accountability can be maintained.

Statement of the Problem

Despite the accelerating deployment of autonomous weapons systems across the arsenals of major military powers, the growing sophistication of AI-enabled command and control architectures, and the intensifying international discourse on autonomous weapons governance, no empirically validated framework exists for managing the dynamic transition of decision authority between human operators and autonomous weapons systems in military command and control environments. This absence constitutes the central problem that this dissertation addresses. A systematic literature review of 67 peer-reviewed studies on human-AI collaboration in high-stakes military contexts (Pokorny, 2026) identified this gap explicitly, documenting that while concepts such as operator contestability (Veluwenkamp & Buijsman, 2025), meaningful human control (Siebert et al., 2023), and human-over-the-loop governance (Nadibaidze et al., 2025) have been proposed as design principles, none has been empirically tested under realistic operational conditions.
The comprehensive literature review presented in Chapter 2 of this dissertation, which examined over 160 sources across multiple disciplines, confirmed and extended these findings, identifying critical gaps in five specific areas: (a) the absence of a dynamic autonomy management model for C2 environments that specifies transfer-of-control triggers, verification checkpoints, and fallback mechanisms (Gap C2-3); (b) the lack of empirical comparison of C2 architectures across military operation types with measurable outcomes for both effectiveness and accountability (Gap C2-5); (c) the need for validated protocols for human override in time-critical autonomous engagement scenarios (Gap AWS-3); (d) the absence of a framework for graduated responsibility allocation in autonomous weapons contexts (Gap AWS-2); and (e) the reliance on self-reported trust measures rather than validated multi-modal measurement instruments (Gap TT-6). The problem is further compounded by the pace of technological change. The rapid advancement of AI capabilities—from narrow task-specific systems to more general multi-modal architectures—means that governance frameworks developed today must be sufficiently flexible to accommodate systems whose capabilities may exceed current expectations within years rather than decades. A static governance framework, however well-designed, will quickly become obsolete in the face of this technological acceleration. What is needed is a dynamic governance architecture that can adapt to evolving capabilities while maintaining the core principles of meaningful human control, accountability, and compliance with international law. The absence of such an architecture is precisely the problem that this dissertation was designed to address. The problem is consequential for national security, operational effectiveness, and ethical governance.
Without validated dynamic autonomy frameworks, military organizations risk deploying autonomous weapons under C2 architectures that are either too restrictive—forfeiting the speed advantages that autonomous systems offer—or too permissive—creating accountability gaps that undermine legal compliance, ethical standards, and public trust in the military’s use of lethal force. The Joint Chiefs of Staff and senior leaders of the joint military industrial base require empirical evidence, not theoretical conjecture, to make informed decisions about how autonomous weapons should be governed in operational contexts. This dissertation was conceived to provide that evidence.

Purpose of the Study

The purpose of this dissertation was to develop, test, and validate an empirically grounded Dynamic Autonomy Management (DAM) framework for human-AI command and control in autonomous weapons systems. The framework was designed to address the central problem identified above by providing (a) empirically validated principles for dynamically allocating decision authority between human commanders and autonomous weapons AI across different operational phases, (b) standardized transfer-of-control protocols that preserve meaningful human agency without degrading operational tempo below mission-critical thresholds, and (c) a comparative effectiveness model of human-in-the-loop, human-on-the-loop, and human-over-the-loop architectures with measurable outcomes for operational performance, trust calibration, and accountability traceability. The study was designed to produce findings directly actionable for the Joint Chiefs of Staff and senior leaders of the joint military industrial base. The grounded theory model provides the conceptual foundation for doctrinal updates. The agent-based computational model enables scenario-based policy analysis. The experimental results establish empirical benchmarks for C2 architecture performance.
Finally, the validated framework offers a ready-to-adopt governance tool for autonomous weapons employment. By integrating qualitative, computational, experimental, and validational methodologies in a sequential mixed-methods design, the study was positioned to produce a framework that is simultaneously theoretically grounded, computationally tested, experimentally verified, and operationally endorsed.

Research Questions

Three research questions guided this investigation. Each question addresses a distinct dimension of the dynamic autonomy management problem and is mapped to specific phases of the research design. Together, the three questions provide a comprehensive framework for investigating the allocation of decision authority between human operators and autonomous weapons systems in military command and control.

Research Question 1

RQ1: What are the critical factors, triggers, and governance constraints that should govern dynamic transitions of decision authority between human operators and autonomous weapons systems in military command and control? This question addresses the foundational design space for dynamic autonomy management. It seeks to identify the conditions under which transitions between autonomy levels are warranted, the governance constraints that must bound such transitions, and the factors that determine optimal authority allocation across different operational phases—surveillance, identification, tracking, engagement, and post-engagement assessment. RQ1 was addressed primarily through Phase 1 (qualitative grounded theory analysis of the 84-document policy and doctrinal corpus), which identified the thematic landscape, emergent categories, and transfer-of-control triggers that formed the conceptual foundation for subsequent phases.
Phase 2 (agent-based modeling) operationalized these qualitative findings into computational parameters, providing quantitative validation of the identified factors and triggers. The significance of RQ1 resides in its capacity to establish the conceptual and empirical foundation upon which the entire dynamic autonomy management framework rests. Without a clear understanding of the factors that should govern authority transitions—including threat characteristics, operational phase, system reliability, operator cognitive state, and governance constraints—any framework for dynamic autonomy would be built on assumption rather than evidence. The existing literature has proposed various concepts for managing human-AI authority allocation, including operator contestability (Veluwenkamp & Buijsman, 2025), meaningful human control properties (Siebert et al., 2023), and situational autonomy adaptation (de Visser et al., 2020), but none has been empirically grounded in a systematic analysis of the policy, doctrinal, and operational literature that defines the actual governance landscape for autonomous weapons.

Research Question 2

RQ2: How do different levels of autonomy (HITL, HOTL, HOVL) affect human-AI team performance, trust calibration, and decision quality across varying operational tempos? This question addresses the comparative effectiveness of different C2 architectures under varying operational conditions. It seeks to quantify the tradeoffs between operational speed, decision quality, human trust, cognitive workload, and accountability that characterize each architecture. RQ2 was addressed primarily through Phase 2 (agent-based computational modeling with 13,500 Monte Carlo iterations) and Phase 3 (simulation-based experimentation with 118 participants in a 3 × 3 factorial design).
Phase 2 provided the computational baseline for architecture comparison, while Phase 3 introduced the human factors dimension—trust dynamics, cognitive load, and behavioral decision patterns—that computational models alone cannot capture. The significance of RQ2 derives from the absence of empirical data comparing C2 architectures across measurable performance dimensions. Prior to this research, the relative merits of HITL, HOTL, and HOVL architectures were debated on theoretical and normative grounds, with advocates of each architecture citing different priorities—accountability proponents favoring HITL, operational tempo advocates favoring HOVL, and compromise positions gravitating toward HOTL—without empirical evidence to adjudicate among these positions. The literature review in Chapter 2 documented this gap explicitly (Gap C2-5), noting that no empirical study had comparatively assessed these architectures with measurable outcomes for both operational effectiveness and accountability traceability.

Research Question 3

RQ3: To what extent does an empirically derived dynamic autonomy management framework meet operational feasibility, doctrinal compatibility, and meaningful human control requirements as assessed by defense professionals? This question addresses the operational viability and practical applicability of the research outputs. It seeks to determine whether the DAM framework, derived from the cumulative evidence of Phases 1 through 3, meets the standards required for adoption in military practice. RQ3 was addressed through Phase 4 (tabletop exercise validation with 18 defense professionals), in which subject matter experts evaluated the framework against five criteria: operational feasibility, doctrinal compatibility, decision traceability, meaningful human control preservation, and scalability. The expert validation provided the critical bridge between research findings and operational applicability.
The significance of RQ3 is fundamentally practical. Research findings that cannot survive contact with operational reality—that are theoretically sound but operationally infeasible or doctrinally incompatible—will not influence the policy and practice decisions they are intended to inform. The defense research community has increasingly recognized the importance of validation by subject matter experts as a critical step in translating research into doctrine (Pokorny, 2026). RQ3 ensures that the DAM framework is evaluated not only by statistical criteria but also by the professional judgment of those who would ultimately implement it in operational environments.

Table 1.1

Research Questions Mapped to Phases, Data Sources, and Methods

Research Question: RQ1: Factors, triggers, and governance constraints for dynamic autonomy transitions
Primary Phase(s): Phase 1, Phase 2
Data Sources: 84-document corpus (DoD directives, GAO reports, CRS analyses, SIPRI data, RAND studies, international legal instruments)
Methods: Grounded theory (open, axial, selective coding); ABM parameterization

Research Question: RQ2: Effects of autonomy levels on performance, trust, and decision quality
Primary Phase(s): Phase 2, Phase 3
Data Sources: 13,500 Monte Carlo iterations; 118 experimental participants across 3 × 3 factorial design
Methods: Agent-based modeling; ANOVA, MANOVA, post-hoc comparisons, effect size analysis

Research Question: RQ3: Operational feasibility and doctrinal compatibility of the DAM framework
Primary Phase(s): Phase 4
Data Sources: 18 defense professionals (military officers, defense civilians, defense industry experts)
Methods: Tabletop exercise; Likert-scale evaluation; ICC reliability analysis; qualitative feedback coding

Note.
ABM = agent-based model; ANOVA = analysis of variance; MANOVA = multivariate analysis of variance; ICC = intraclass correlation coefficient; DAM = Dynamic Autonomy Management; DoD = Department of Defense; GAO = Government Accountability Office; CRS = Congressional Research Service; SIPRI = Stockholm International Peace Research Institute.

Significance of the Study

Theoretical Significance

This dissertation makes several contributions to theory that extend existing knowledge in human-AI teaming, trust calibration, dynamic autonomy, and command and control. First, the research provides the first empirical integration of five foundational theoretical frameworks—Levels of Automation theory (Parasuraman et al., 2000), Trust in Automation (Lee & See, 2004), Meaningful Human Control (Santoni de Sio & van den Hoven, 2018), Naturalistic Decision-Making (Klein, 1998), and C2 Agility (Alberts, 2011)—into a unified framework validated against military operational requirements. While these theories have been individually well-developed, no prior research has integrated them into a cohesive model for dynamic autonomy management in weapons employment contexts. Second, the research identified and quantified the speed–accountability tradeoff as the central design constraint for dynamic autonomy systems. The finding that increasing autonomy from HITL to HOVL reduces response latency by 85.9% while simultaneously degrading accountability chain integrity by 30.3 percentage points provides the first empirical parameterization of this tradeoff, establishing a quantitative foundation for theoretical models of human-machine authority allocation. Third, the discovery of the trust–accuracy paradox—wherein higher-autonomy systems produce objectively better performance but engender lower operator trust—challenges the assumption implicit in much automation theory that trust should increase with system capability.
This finding has implications for trust calibration theory that extend well beyond the military domain. Fourth, the research bridges the gap between abstract automation theory developed primarily in laboratory settings and the operational reality of military C2, demonstrating that theoretical constructs such as levels of automation and trust calibration can be meaningfully applied to the high-stakes, high-tempo, and high-consequence environment of autonomous weapons employment. This bridge is essential for the continued relevance of human factors theory in an era when autonomous systems are moving from controlled research environments to contested operational domains.

Practical Significance

The practical significance of this research resides in the actionable tools it provides for military operators, system designers, and training program developers. The DAM framework offers a structured, empirically validated architecture for designing C2 systems that dynamically allocate decision authority between humans and autonomous weapons based on operational conditions, threat tempo, and governance constraints. For military operators, the framework provides clear guidance on when and how autonomy transitions should occur, what verification checkpoints must be maintained, and what fallback mechanisms ensure meaningful human control even under high-tempo conditions. For system designers, the research provides empirical benchmarks for C2 architecture performance. The finding that HOTL achieves 86.3% mission success with 86.3% accountability integrity and 2.70-second response latency—compared to HITL’s 71.6% mission success with 97.8% accountability at 8.51 seconds and HOVL’s 89.3% mission success with 68.2% accountability at 1.20 seconds—provides quantitative parameters for engineering design decisions.
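The benchmark figures reported above can be collected into a small table and used to compute the relative tradeoffs directly. The sketch below is illustrative only—the dictionary layout and function name are mine, not part of the dissertation's tooling—while the numeric values are the Phase 2 benchmarks as reported in the text.

```python
# Phase 2 benchmark figures reported in the text for the three C2
# architectures: mission success (%), accountability chain integrity (%),
# and mean response latency (seconds).
BENCHMARKS = {
    "HITL": {"success": 71.6, "accountability": 97.8, "latency_s": 8.51},
    "HOTL": {"success": 86.3, "accountability": 86.3, "latency_s": 2.70},
    "HOVL": {"success": 89.3, "accountability": 68.2, "latency_s": 1.20},
}

def latency_reduction(from_arch, to_arch):
    """Relative reduction in mean response latency between two architectures."""
    a = BENCHMARKS[from_arch]["latency_s"]
    b = BENCHMARKS[to_arch]["latency_s"]
    return (a - b) / a

# Moving from HITL to HOVL cuts mean latency by roughly 85.9%,
# matching the tradeoff figure cited earlier in this chapter.
reduction = latency_reduction("HITL", "HOVL")
```

Computed this way, (8.51 − 1.20) / 8.51 ≈ 0.859, which reproduces the 85.9% latency reduction the chapter reports for the HITL-to-HOVL transition.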
For training programs, the findings on cognitive load interactions (autonomy level × threat tempo interaction η²p = .16) suggest specific areas where operator training must be adapted for different C2 configurations. The practical significance extends to the acquisition and procurement processes that shape how autonomous weapons systems are developed and fielded. Defense acquisition programs for autonomous weapons currently lack validated performance specifications for human-AI C2 integration—there are requirements for system speed, reliability, and lethality but no equivalent requirements for governance architecture performance, trust calibration, or dynamic authority management. The DAM framework and its associated empirical benchmarks provide a basis for developing such requirements, enabling acquisition professionals to specify not only what autonomous weapons must do but how human control over those systems must be maintained. This contribution addresses a gap in the acquisition process that has significant implications for the development of responsible autonomous weapons.

Policy Significance

The policy implications of this research are direct and consequential. First, the DAM framework provides empirically grounded guidance for the implementation and potential revision of DoDD 3000.09. The directive’s requirement for “appropriate levels of human judgment” can now be operationalized through the framework’s specification of autonomy levels, transition triggers, and accountability mechanisms validated across four research phases. Second, the framework directly informs Joint Chiefs of Staff decision-making on autonomous weapons governance by providing the empirical evidence base that doctrine development requires. The finding that HOTL represents the optimal default architecture—balancing speed, accountability, and operator trust—provides a concrete recommendation for doctrinal guidance.
Third, the research strengthens the United States’ position in international LAWS negotiations by demonstrating that meaningful human control and operational effectiveness are not mutually exclusive but can be balanced through carefully designed dynamic autonomy systems. The empirical evidence that accountability chain integrity can be maintained at high levels (86.3% under HOTL) while achieving near-autonomous response times provides a counter to the argument that human control necessarily degrades military capability. Fourth, for the defense industrial base, the framework provides design requirements and performance specifications that can guide the development of next-generation autonomous weapons C2 systems, aligning industrial design with validated governance principles. The policy significance also encompasses the evolving relationship between the Department of Defense and Congress regarding autonomous weapons oversight. Congressional interest in autonomous weapons governance has increased substantially in recent years, driven by both the pace of technological development and high-profile public debates about the ethics of autonomous weapons (Congressional Research Service, 2024). The DAM framework provides a common vocabulary and empirical foundation for legislative-executive dialogue on autonomous weapons governance, moving the conversation from abstract ethical debates to evidence-based policy discussions. The framework’s specification of measurable performance metrics—accountability chain integrity, response latency, decision quality, and trust calibration—provides concrete parameters for oversight mechanisms and reporting requirements.

Societal Significance

Beyond its military and policy implications, this research addresses questions of fundamental societal importance. Democratic governance requires that the use of lethal force by the state be subject to meaningful human oversight and accountability.
The deployment of autonomous weapons that can select and engage targets without direct human authorization raises profound questions about the democratic accountability of military action (Human Rights Watch, 2012). The DAM framework’s emphasis on accountability chain integrity—and its empirical demonstration that such integrity can be maintained across different autonomy configurations—contributes to the preservation of democratic accountability in an era of increasingly autonomous military technology. The research also addresses the preservation of human dignity in warfare. International humanitarian law rests on the premise that decisions involving the life and death of individuals should be made by moral agents capable of exercising judgment, compassion, and restraint (UNIDIR, 2025). By establishing empirically validated mechanisms for maintaining meaningful human control over autonomous weapons decisions, this research supports the continued relevance of these principles in an age of artificial intelligence. Finally, by demonstrating that responsible governance of autonomous weapons is achievable through rigorous empirical research, the dissertation contributes to public trust in the military’s capacity to develop and deploy these technologies in a manner consistent with democratic values and international norms. The societal significance of this research extends to the broader relationship between artificial intelligence and democratic governance. The autonomous weapons domain represents perhaps the highest-stakes arena in which AI systems are being entrusted with consequential decisions, but the principles at stake—accountability, transparency, human oversight, and the appropriate allocation of authority between human judgment and machine capability—are relevant across every domain in which AI is being deployed.
The frameworks and empirical findings developed in this dissertation, while specific to military autonomous weapons, contribute to the broader societal conversation about how democratic societies should govern the delegation of consequential decisions to artificial intelligence systems. In this sense, the autonomous weapons governance challenge is a bellwether for the governance challenges that AI will pose across transportation, healthcare, criminal justice, and other domains where machine decisions affect human welfare.

Theoretical Framework

This dissertation is grounded in an integrated theoretical framework that draws on five foundational bodies of theory, each addressing a distinct dimension of the dynamic autonomy management problem. The integration of these theories provides the conceptual architecture that guided the research design, informed the operationalization of constructs, and framed the interpretation of findings. No single theory is sufficient to address the multifaceted challenge of governing human-AI authority allocation in autonomous weapons C2; the integrated framework reflects the inherently interdisciplinary nature of this research problem. The first theoretical foundation is the Levels of Automation (LOA) framework, originally articulated by Sheridan and Verplank (1978) and subsequently refined by Parasuraman, Sheridan, and Wickens (2000). The LOA framework conceptualizes automation as a continuum ranging from full human control to full machine autonomy, with intermediate levels representing different allocations of functions between human and machine across four information-processing stages: information acquisition, information analysis, decision selection, and action implementation.
This framework provides the conceptual basis for distinguishing among the three C2 architectures investigated in this dissertation—HITL, HOTL, and HOVL—and for understanding the human factors implications of moving along the autonomy continuum. The second foundation is Trust in Automation theory, drawing primarily on the seminal framework of Lee and See (2004) and the dispositional-situational-learned trust model of Hoff and Bashir (2015). Trust in automation is conceptualized as a multidimensional attitude reflecting the operator’s willingness to rely on the automated system, shaped by the system’s performance, process, and purpose attributes. In military contexts, trust calibration—the alignment between the operator’s trust in the system and the system’s actual trustworthiness—is critical because both over-trust (leading to complacency and failures of oversight) and under-trust (leading to unnecessary human intervention that degrades operational tempo) can have lethal consequences. This dissertation’s investigation of trust dynamics across different autonomy levels and threat tempos is directly grounded in this theoretical tradition. The third foundation is Meaningful Human Control (MHC), a concept developed by Santoni de Sio and van den Hoven (2018) that has become central to international autonomous weapons governance discourse. MHC posits that human control over autonomous systems is “meaningful” when the human agent possesses both tracking control (the ability to monitor and understand the system’s behavior) and tracing control (the ability to be held responsible for the system’s actions because the outcomes can be traced to the human’s decisions). This framework provides the normative foundation for the dissertation’s emphasis on accountability chain integrity as a primary performance metric for C2 architectures. The fourth foundation is Naturalistic Decision-Making (NDM), particularly the Recognition-Primed Decision model articulated by Klein (1998).
NDM theory describes how experienced professionals make decisions in complex, time-pressured, high-stakes environments—precisely the conditions characterizing autonomous weapons employment. NDM emphasizes that expert decision-making relies on pattern recognition, mental simulation, and satisficing rather than the analytical comparison of options assumed by classical decision theory. This framework informed the dissertation’s attention to cognitive load, situation awareness, and the compatibility of autonomous system outputs with naturalistic human decision processes. The fifth foundation is C2 Agility, as articulated by Alberts (2011), which conceptualizes effective command and control as the capacity to adapt C2 arrangements to match the demands of the operational environment. C2 Agility theory argues that no single C2 structure is optimal across all conditions; rather, effective C2 requires the ability to dynamically shift among different structural configurations as circumstances change. This framework provides the direct theoretical warrant for the dissertation’s central premise: that dynamic transitions among HITL, HOTL, and HOVL architectures, governed by validated triggers and protocols, can achieve superior outcomes compared to any fixed C2 configuration. The integration of these five theoretical foundations is not merely additive but synergistic. The Levels of Automation framework defines the structural options available (HITL, HOTL, HOVL); Trust in Automation theory explains how human operators relate to and rely upon those structures; Meaningful Human Control establishes the normative criteria that any structure must satisfy; Naturalistic Decision-Making describes how human decision-makers actually behave within those structures under operational stress; and C2 Agility provides the theoretical warrant for dynamic transitions among structures.
Together, these theories create a comprehensive framework that addresses the structural, psychological, normative, behavioral, and organizational dimensions of dynamic autonomy management. Each research question draws on multiple theoretical foundations: RQ1 engages LOA, MHC, and C2 Agility; RQ2 engages LOA, Trust in Automation, and NDM; and RQ3 engages MHC, C2 Agility, and Trust in Automation. This cross-cutting theoretical engagement ensures that the research findings are situated within a rich, multidimensional theoretical context.

Figure 1.1

Integrated Theoretical Framework for Dynamic Autonomy Management

INTEGRATED THEORETICAL FRAMEWORK
Levels of Automation (Parasuraman et al., 2000) → Defines the HITL–HOTL–HOVL autonomy spectrum → RQ1, RQ2
Trust in Automation (Lee & See, 2004; Hoff & Bashir, 2015) → Governs trust calibration across autonomy levels → RQ2
Meaningful Human Control (Santoni de Sio & van den Hoven, 2018) → Establishes accountability and governance requirements → RQ1, RQ3
Naturalistic Decision-Making (Klein, 1998) → Models human decision behavior under operational stress → RQ2
C2 Agility (Alberts, 2011) → Provides warrant for dynamic autonomy transitions → RQ1, RQ3
──────── CONVERGENCE ────────
Dynamic Autonomy Management (DAM) Framework
Speed – Accountability – Trust – Decision Quality

Note. The five foundational theories converge to address distinct dimensions of the dynamic autonomy management problem. LOA = Levels of Automation; NDM = Naturalistic Decision-Making; MHC = Meaningful Human Control; C2 = command and control; HITL = human-in-the-loop; HOTL = human-on-the-loop; HOVL = human-over-the-loop.

Research Design Overview

This dissertation employed a four-phase sequential mixed-methods design that integrates qualitative grounded theory development, agent-based computational modeling, simulation-based experimentation, and tabletop exercise validation.
The design follows an exploratory sequential structure (Creswell & Plano Clark, 2018) in which each phase builds upon the outputs of its predecessor, creating a cumulative evidence base that moves from qualitative exploration through computational testing and experimental verification to operational validation. The logic of this sequential design is that complex, multifaceted research problems in applied defense fields require the complementary strengths of multiple methodological traditions, integrated through deliberate sequencing and explicit points of methodological connection. The mixed-methods design was chosen specifically to address the multidimensional nature of the research problem. The governance of autonomous weapons involves normative questions (what ought to be) that are best addressed through qualitative analysis of policy and doctrinal discourse, computational questions (what would happen) that require simulation modeling, empirical questions (what does happen) that demand controlled experimentation, and practical questions (what can work) that necessitate expert validation. No single methodological tradition can address all four dimensions. The sequential design ensures that findings are not only internally consistent within each phase but also externally validated through methodological triangulation across phases—a critical requirement for research intended to inform high-stakes policy decisions.

Phase 1: Qualitative Grounded Theory Analysis

Phase 1 applied constructivist grounded theory methods (Charmaz, 2014) to a corpus of 84 documents comprising Department of Defense directives, Government Accountability Office reports, Congressional Research Service analyses, RAND Corporation studies, SIPRI publications, UNIDIR reports, and international legal instruments.
Through systematic open, axial, and selective coding, this phase identified 19 thematic codes organized into eight emergent categories, with Autonomy Governance emerging as the core category with the highest centrality score. Phase 1 outputs—including transfer-of-control triggers, governance constraints, and the emergent theoretical model—served as direct inputs for parameterizing the Phase 2 agent-based model. The document corpus was assembled through systematic search of institutional repositories including the Defense Technical Information Center, the Government Accountability Office, the Congressional Research Service, RAND Corporation, the Center for a New American Security, SIPRI, and UNIDIR. Documents were selected based on their relevance to autonomous weapons governance, C2 architecture design, human-AI teaming in military contexts, and international legal frameworks for autonomous weapons. The coding process followed the constant comparative method (Charmaz, 2014), with iterative rounds of coding, memoing, and theoretical sampling ensuring that the emergent categories were grounded in the data rather than imposed from prior theory.

Phase 2: Agent-Based Computational Modeling

Phase 2 translated the qualitative findings of Phase 1 into a computational simulation framework, constructing an agent-based model (ABM) that simulated human-AI C2 interactions across 13,500 Monte Carlo iterations (1,000 iterations per condition across three C2 architectures and three threat tempo levels, with additional sensitivity analyses). The ABM quantified the speed–accountability tradeoff with statistical precision, establishing performance benchmarks for each architecture: HITL (97.8% accountability, 8.51s latency, 71.6% mission success), HOTL (86.3% accountability, 2.70s latency, 86.3% mission success), and HOVL (68.2% accountability, 1.20s latency, 89.3% mission success).
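The stated design implies 9,000 primary iterations (1,000 per cell of the 3 × 3 architecture-by-tempo grid), with sensitivity analyses accounting for the remainder. The following is a minimal, illustrative sketch of such a Monte Carlo loop; the latency and accountability parameters and the tempo scaling are hypothetical placeholders, not the dissertation's calibrated model inputs.

```python
import random
import statistics

# Illustrative architecture parameters (hypothetical values, NOT the
# dissertation's calibrated inputs): mean response latency in seconds and
# the probability that the accountability chain remains intact.
ARCHITECTURES = {
    "HITL": {"mean_latency": 8.5, "p_accountable": 0.978},
    "HOTL": {"mean_latency": 2.7, "p_accountable": 0.863},
    "HOVL": {"mean_latency": 1.2, "p_accountable": 0.682},
}
TEMPOS = ["low", "medium", "high"]
ITERATIONS_PER_CELL = 1_000  # 3 architectures x 3 tempos x 1,000 = 9,000 runs

def run_engagement(arch, tempo, rng):
    """Simulate one engagement cycle; return (latency_s, accountable)."""
    params = ARCHITECTURES[arch]
    # Higher tempo compresses available decision time (illustrative scaling).
    tempo_factor = {"low": 1.0, "medium": 0.9, "high": 0.8}[tempo]
    latency = max(0.1, rng.gauss(params["mean_latency"] * tempo_factor, 0.5))
    accountable = rng.random() < params["p_accountable"]
    return latency, accountable

rng = random.Random(42)  # fixed seed for reproducibility
results = {}
for arch in ARCHITECTURES:
    for tempo in TEMPOS:
        runs = [run_engagement(arch, tempo, rng) for _ in range(ITERATIONS_PER_CELL)]
        results[(arch, tempo)] = {
            "mean_latency": statistics.mean(r[0] for r in runs),
            "accountability": statistics.mean(r[1] for r in runs),
        }

total_runs = len(results) * ITERATIONS_PER_CELL
print(total_runs)  # 9000 primary iterations before sensitivity analyses
```

Aggregating per-cell means in this way is what lets a study report architecture-level benchmarks (latency, accountability) with Monte Carlo precision; sensitivity analyses would rerun the same loop while perturbing one parameter at a time.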
These computational results generated specific hypotheses for experimental testing in Phase 3. The ABM employed a stochastic simulation architecture in which autonomous agents representing human operators, AI systems, and target entities interacted according to rules derived from Phase 1 qualitative findings. Each Monte Carlo iteration simulated a complete engagement cycle from target detection through post-engagement assessment, with random variation in threat parameters, operator response characteristics, and environmental conditions. Sensitivity analyses examined the impact of key parameters on model outcomes, with threat tempo identified as the highest-impact parameter—a finding that directly informed the experimental design of Phase 3.

Phase 3: Simulation-Based Experimentation

Phase 3 conducted a 3 × 3 between-subjects factorial experiment with 118 participants, crossing three autonomy levels (HITL, HOTL, HOVL) with three threat tempo conditions (low, medium, high). Dependent variables included decision accuracy, response time, trust score (measured on a 7-point Likert scale), cognitive load (NASA-TLX), and rules of engagement compliance. Two-way ANOVA revealed large effects of autonomy level on response time (η²p = .73) and significant autonomy × tempo interactions on cognitive load (η²p = .16). The experimental results confirmed the computational predictions of Phase 2 while adding the human factors dimension—trust dynamics, cognitive load patterns, and behavioral decision strategies—that computational models alone cannot capture.
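Partial eta squared, the effect-size metric reported for the ANOVA results, is the ratio of the effect sum of squares to the effect plus error sums of squares. A small helper illustrating the formula; the sums of squares below are made-up numbers chosen only to reproduce the reported magnitude, not Phase 3 data.

```python
def partial_eta_squared(ss_effect: float, ss_error: float) -> float:
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

# Hypothetical sums of squares whose ratio matches the reported effect
# size for autonomy level on response time (eta^2_p = .73).
ss_autonomy = 73.0
ss_error = 27.0
print(round(partial_eta_squared(ss_autonomy, ss_error), 2))  # 0.73
```

By convention (Cohen's benchmarks), values around .14 and above are considered large, which is why both reported effects (.73 and .16) qualify as substantial.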
Phase 4: Expert Tabletop Validation

Phase 4 convened expert tabletop exercises with 18 defense professionals—comprising active-duty and recently retired military officers, senior defense civilians, and defense industry technical leaders—to evaluate the DAM framework against five criteria: operational feasibility (M = 5.17, SD = 0.99), doctrinal compatibility (M = 5.50, SD = 0.79), decision traceability (M = 5.83, SD = 0.62), meaningful human control preservation (M = 5.56, SD = 1.20), and scalability (M = 4.72, SD = 1.18), all on a 7-point scale. All criteria were rated significantly above neutral (p < .001), with decision traceability receiving the highest endorsement and scalability identified as the primary area for future development.

Figure 1.2
Four-Phase Sequential Mixed-Methods Research Design Overview

FOUR-PHASE SEQUENTIAL MIXED-METHODS DESIGN

Phase 1: Qualitative Grounded Theory
  N = 84 documents; 19 thematic codes, 8 categories; core category: Autonomy Governance
    │ Parameterizes ↓
Phase 2: Agent-Based Modeling
  13,500 Monte Carlo iterations; 3 architectures × 3 conditions; speed–accountability tradeoff quantified
    │ Generates hypotheses ↓
Phase 3: Experimental Testing
  N = 118 participants; 3 × 3 factorial design; large effects confirmed
    │ Informs framework ↓
Phase 4: Expert Validation
  N = 18 defense professionals; 5 evaluation criteria; all criteria > neutral (p < .001)
    ↓
DAM FRAMEWORK VALIDATED

Note. Each phase builds upon its predecessor. Arrows indicate the flow of outputs that serve as inputs for subsequent phases. Phase 1 themes parameterize Phase 2 models; Phase 2 predictions generate Phase 3 hypotheses; Phases 1–3 results inform Phase 4 validation criteria. N = sample size; ABM = agent-based model.
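A "significantly above neutral" rating on a 7-point scale corresponds to a one-sample t-test of each criterion mean against the scale midpoint of 4. The sketch below shows that procedure (not necessarily the exact test the dissertation ran) using only the reported summary statistics for the highest-rated criterion; the two-tailed α = .001 critical value for df = 17 is approximately 3.97.

```python
import math

def one_sample_t(mean: float, sd: float, n: int, mu0: float = 4.0) -> float:
    """t statistic for a one-sample test against the scale midpoint mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))

# Reported summary statistics for Decision Traceability:
# M = 5.83, SD = 0.62, n = 18 raters.
t_traceability = one_sample_t(5.83, 0.62, 18)
print(round(t_traceability, 2))  # roughly 12.5, well past the df = 17 cutoff
```

The same function applied to the other four criteria's reported means and standard deviations would reproduce the remaining comparisons against the neutral midpoint.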
The reader is directed to Chapter 3 for complete methodological details, including the philosophical foundations of the mixed-methods approach, detailed descriptions of sampling strategies, data collection procedures, operational definitions, analytical methods, and reliability and validity considerations for each phase.

Scope and Delimitations

The scope of this dissertation was deliberately defined to ensure methodological rigor and practical relevance while acknowledging the boundaries of what a single doctoral investigation can accomplish. The following delimitations describe the choices made to focus the research and should be understood as intentional design decisions rather than limitations. The geographic and institutional scope of this research is centered on the United States military context, including U.S. Department of Defense policy, U.S. military doctrine, and the perspectives of defense professionals working within or closely aligned with U.S. defense institutions. While the literature review and qualitative analysis included international sources—particularly from NATO allies, the United Nations, and international research institutes—the governance framework was designed with primary reference to U.S. legal authorities, doctrinal traditions, and command structures. This focus reflects the intended audience of the research: the Joint Chiefs of Staff and senior leaders of the U.S. joint military industrial base. The system scope encompasses autonomous and semi-autonomous weapons systems as defined by DoDD 3000.09, including systems that can select and engage targets with or without human intervention.
The research does not address fully autonomous general artificial intelligence systems that do not yet exist, nor does it address non-weapons autonomous military systems (e.g., autonomous logistics vehicles, surveillance-only platforms) except insofar as the C2 principles developed may be generalizable to such systems. The data scope is limited to publicly available, unclassified sources. This delimitation ensures broad dissemination of findings and scholarly reproducibility but means that classified operational data, intelligence assessments, and restricted technical specifications that may inform real-world autonomous weapons governance are not incorporated into the analysis. The temporal scope of the literature review encompasses sources published between 2018 and early 2026, reflecting the rapid evolution of both the technology and the governance discourse. The methodological scope relies on simulated rather than live operational data for Phases 3 and 4. The experimental phase employed simulation-based scenarios rather than live autonomous weapons operations, and the validation phase employed tabletop exercises rather than field deployments. These choices reflect the practical and ethical constraints of research involving autonomous weapons—live testing with lethal autonomous systems is neither feasible nor appropriate in a doctoral research context—while still providing rigorous empirical evidence through validated simulation methodologies. An additional delimitation concerns the level of analysis. This research examines dynamic autonomy management at the level of individual human-platform interaction—a single human operator or command team interacting with a single autonomous weapons system or a small number of coordinated systems.
The scaling of dynamic autonomy management to theater-level operations involving hundreds of autonomous platforms across multiple domains is beyond the scope of this research, though the framework is designed to provide a foundational architecture that can be extended to larger scales. The Phase 4 expert validation identified scalability as the primary area requiring future research, with experts rating it the lowest of the five evaluation criteria (M = 4.72, SD = 1.18). Future research should explicitly address the multi-platform, multi-domain scaling challenge that this delimitation acknowledges.

Definition of Key Terms

The following definitions establish the operational meaning of key terms used throughout this dissertation. Each definition is grounded in the scholarly or policy literature and cited accordingly. Consistent use of these definitions throughout the dissertation ensures terminological precision and facilitates comparison with existing research.

Table 1.2
Definition of Key Terms

Autonomous Weapons System (AWS): A weapon system that, once activated, can select and engage targets without further intervention by a human operator (U.S. Department of Defense, 2023).

Human-in-the-Loop (HITL): A C2 architecture in which a human operator must authorize each critical decision, particularly engagement decisions, before the system executes (Parasuraman et al., 2000).

Human-on-the-Loop (HOTL): A C2 architecture in which the autonomous system can initiate actions autonomously while the human operator maintains supervisory oversight and the ability to intervene or abort (Nadibaidze et al., 2025).

Human-over-the-Loop (HOVL): A C2 architecture in which the human sets strategic parameters, governance constraints, and rules within which the autonomous system operates independently (Nadibaidze et al., 2025).
Dynamic Autonomy: The capacity of a human-AI system to transition between different levels of autonomy based on operational conditions, threat environment, and governance requirements (Alberts, 2011).

Meaningful Human Control (MHC): Human control over an autonomous system is meaningful when the human possesses both tracking control (monitoring and understanding) and tracing control (accountability for outcomes; Santoni de Sio & van den Hoven, 2018).

Transfer-of-Control: The process by which decision authority is shifted from one autonomy level to another, governed by specified triggers, verification checkpoints, and fallback mechanisms (Sheridan & Verplank, 1978).

Trust Calibration: The alignment between an operator's trust in an automated system and the system's actual trustworthiness, encompassing both over-trust and under-trust (Lee & See, 2004).

Command and Control (C2): The exercise of authority and direction by a properly designated commander over assigned and attached forces in the accomplishment of a mission (U.S. Army, 2019).

Joint All-Domain Command and Control (JADC2): The Department of Defense concept for a networked, AI-enabled C2 architecture integrating sensors, shooters, and decision-makers across all warfighting domains (U.S. Department of Defense, 2022).

Rules of Engagement (ROE): Directives issued by competent military authority that delineate the circumstances and limitations under which forces will initiate or continue combat engagement (U.S. Army, 2019).

Accountability Chain Integrity: The degree to which a clear, traceable line of decision authority and responsibility is maintained from the commanding officer through the autonomous system to the engagement outcome.

Agent-Based Modeling (ABM): A computational modeling approach that simulates the actions and interactions of autonomous agents to assess their effects on the system as a whole (Bonabeau, 2002).
Decision Quality: A composite metric encompassing decision accuracy, timeliness, compliance with rules of engagement, and alignment with mission objectives.

Cognitive Load: The total amount of mental effort being used in working memory during a task, measured in this study using the NASA Task Load Index (Hart & Staveland, 1988).

Note. Definitions reflect operational usage in this dissertation. Some terms have broader meanings in other contexts. C2 = command and control; C2 architecture definitions adapted from the typology presented in Nadibaidze et al. (2025) and Parasuraman et al. (2000).

Assumptions

Several assumptions underlie this research and should be made explicit. First, this study assumes that publicly available, unclassified data—including Department of Defense directives, Government Accountability Office reports, Congressional Research Service analyses, and publications from international research institutes—adequately represents the essential dynamics of autonomous weapons governance, even though classified operational data and technical specifications are not accessible. This assumption is supported by the breadth and depth of the unclassified literature, which has been the basis for policy development and scholarly analysis in this domain for over a decade. Second, the study assumes that experimental participants in simulated scenarios approximate the decision-making behaviors and cognitive processes of real military operators interacting with autonomous weapons systems. While simulation-based research cannot fully replicate the stress, fatigue, moral weight, and organizational dynamics of live operations, the simulation-based experimental paradigm is the established standard in defense-related human factors research (Creswell & Plano Clark, 2018) and has been validated as a reasonable proxy for operational behavior in numerous prior studies.
Third, the study assumes that the agent-based computational model captures the essential dynamics of human-AI C2 interactions with sufficient fidelity to generate meaningful predictions about architecture performance. Agent-based modeling is an established methodology for studying complex adaptive systems (Bonabeau, 2002), and the model's parameters were directly derived from the qualitative findings of Phase 1, ensuring theoretical grounding. The convergence of ABM predictions with experimental results in Phase 3 provides post hoc validation of this assumption. Fourth, the study assumes that the 18 defense professionals who participated in the Phase 4 tabletop exercises possess sufficient expertise and representativeness to provide valid assessments of the DAM framework's operational feasibility and doctrinal compatibility. The purposive sampling strategy and the achieved intraclass correlation coefficients reported in Chapter 4 support this assumption. Fifth, the study assumes that the three C2 architectures investigated—human-in-the-loop, human-on-the-loop, and human-over-the-loop—represent the operationally relevant range of human-AI authority allocation for autonomous weapons systems. While additional intermediate configurations exist, and fully autonomous operation without any human involvement represents a theoretical endpoint of the autonomy spectrum, the three architectures selected for this study capture the primary governance modes currently discussed in policy, doctrinal, and scholarly literature. Sixth, the study assumes that the five evaluation criteria used in Phase 4—operational feasibility, doctrinal compatibility, decision traceability, meaningful human control preservation, and scalability—capture the dimensions most relevant to operational adoption of a dynamic autonomy framework.
These criteria were derived from the Phase 1 qualitative analysis and validated through preliminary consultation with defense professionals prior to the tabletop exercises.

Organization of the Dissertation

This dissertation is organized into six chapters that follow a logical progression from theoretical foundation through empirical investigation to synthesis and conclusion. The structure reflects the sequential mixed-methods design, with each chapter building upon the evidence and analysis presented in preceding chapters.

Table 1.3
Organization of the Dissertation

Chapter 1, Introduction. Purpose: Establish the research problem, purpose, questions, and design. Key content: Background, problem statement, significance, theoretical framework, research design overview, scope and delimitations, key terms.

Chapter 2, Literature Review. Purpose: Provide a comprehensive synthesis of relevant scholarship across disciplines. Key content: Theoretical foundations, AWS development, human-AI teaming, trust dynamics, dynamic autonomy, C2 theory, MHC frameworks, legal/ethical analysis, XAI, computational modeling; 160+ sources.

Chapter 3, Methodology. Purpose: Detail the four-phase sequential mixed-methods research design. Key content: Research philosophy, design rationale, Phase 1–4 procedures, sampling, instruments, data analysis plans, reliability/validity, ethical safeguards, limitations.

Chapter 4, Results. Purpose: Present findings from all four research phases. Key content: Phase 1 grounded theory categories; Phase 2 ABM performance metrics; Phase 3 ANOVA/MANOVA results; Phase 4 expert ratings; cross-phase integration.

Chapter 5, Discussion. Purpose: Interpret findings, present the DAM framework, and discuss implications. Key content: Interpretation by RQ, integrated DAM framework, theoretical/practical/policy implications, limitations, future research directions.

Chapter 6, Conclusion. Purpose: Synthesize contributions and issue a call to action. Key content: Summary of findings, DAM framework architecture, contributions to the field, recommendations for policy and practice, closing statement.

Note. AWS = autonomous weapons systems; ABM = agent-based model; MHC = meaningful human control; XAI = explainable artificial intelligence; ANOVA = analysis of variance; MANOVA = multivariate analysis of variance; DAM = Dynamic Autonomy Management; RQ = research question.

Chapter 2 presents the comprehensive literature review, synthesizing over 160 sources across ten thematic areas to establish the theoretical and empirical foundations for the research. The review identifies the critical research gaps that motivated the study and presents the integrated conceptual framework. Chapter 3 details the methodology, providing complete descriptions of the four-phase sequential mixed-methods design, including sampling strategies, data collection procedures, analytical methods, and reliability and validity safeguards for each phase. Chapter 4 reports the results of all four phases, organized sequentially and concluding with a cross-phase integration section that maps convergent and divergent findings to each research question. Chapter 5 presents the discussion, interpreting findings in light of the theoretical framework, presenting the integrated DAM framework, and addressing theoretical, practical, and policy implications alongside study limitations and future research directions. Chapter 6 provides the conclusion, synthesizing the dissertation's contributions and issuing a call to action for the defense policy community. The logic connecting these chapters reflects the sequential mixed-methods design. Chapter 2 identifies the theoretical foundations and research gaps that motivate the study. Chapter 3 describes the methodology designed to address those gaps. Chapter 4 presents what the research found. Chapter 5 interprets what the findings mean—for theory, practice, and policy. Chapter 6 synthesizes the contributions and articulates the implications for the defense policy community.
This structure ensures that each chapter contributes a distinct element to the cumulative argument while building upon the evidence and analysis of its predecessors. The reader who proceeds sequentially through the chapters will encounter a progressively more complete and nuanced understanding of the dynamic autonomy management problem and its empirically validated solution.

Chapter Summary

This chapter has established the foundation for a doctoral investigation into dynamic autonomy management in human-AI command and control for autonomous weapons systems. The accelerating deployment of autonomous military technologies, the transformation of command and control architectures through AI integration, and the intensifying governance discourse at national and international levels create an urgent need for the empirically validated framework that this research provides. The central problem—the absence of validated frameworks for managing dynamic transitions of decision authority between human operators and autonomous weapons systems—was grounded in a systematic literature review and comprehensive gap analysis. The purpose of the study—to develop, test, and validate a Dynamic Autonomy Management framework—was operationalized through three research questions addressed by a four-phase sequential mixed-methods design integrating qualitative, computational, experimental, and validational approaches. The significance of the research spans theoretical, practical, policy, and societal dimensions, with findings directly relevant to Joint Chiefs of Staff doctrine development, DoDD 3000.09 implementation, and international autonomous weapons governance.
The research design overview presented in this chapter previewed the four-phase sequential mixed-methods approach—from qualitative grounded theory through agent-based modeling and simulation-based experimentation to expert tabletop validation—that forms the empirical backbone of the dissertation. Key quantitative findings from this investigation include the quantification of the speed–accountability tradeoff (HITL: 97.8% accountability at 8.51s latency; HOVL: 68.2% at 1.20s), the identification of HOTL as the optimal default architecture (86.3% mission success, 86.3% accountability, 2.70s latency), and the expert endorsement of the DAM framework across all five evaluation criteria (all rated significantly above neutral, p < .001). These findings, detailed in Chapters 4 and 5, provide the empirical foundation for the governance architecture that this dissertation contributes to the field. The integrated theoretical framework—drawing on Levels of Automation, Trust in Automation, Meaningful Human Control, Naturalistic Decision-Making, and C2 Agility theories—provides the conceptual architecture guiding the research. The scope and delimitations define the boundaries of the investigation, and the key terms establish the terminological precision necessary for rigorous scholarship. Chapter 2 presents the comprehensive literature review that extends and deepens the theoretical foundations introduced here.

CHAPTER 2: LITERATURE REVIEW

Introduction to the Literature Review

The accelerating integration of artificial intelligence into military command and control architectures has generated a vast and rapidly expanding body of scholarship spanning multiple disciplines, including computer science, military science, international law, ethics, human factors engineering, and organizational psychology.
This literature review provides a comprehensive, critical synthesis of the scholarly and policy literature relevant to the central research problem of this dissertation: the absence of empirically validated frameworks for dynamically managing the allocation of decision authority between human commanders and autonomous weapons systems across the spectrum of military operations. The review encompasses over 160 sources drawn from peer-reviewed journals, government reports, think tank analyses, military doctrine, international legal instruments, and conference proceedings, reflecting the inherently interdisciplinary nature of autonomous weapons governance and human-AI teaming in high-stakes environments. The search strategy employed for this review followed a systematic approach consistent with best practices in defense-related research synthesis. The primary databases searched included IEEE Xplore, ACM Digital Library, Web of Science, Scopus, ScienceDirect, SpringerLink, and Google Scholar. Additionally, institutional repositories including the Defense Technical Information Center (DTIC), RAND Corporation, the Center for a New American Security (CNAS), the Stockholm International Peace Research Institute (SIPRI), and the United Nations Institute for Disarmament Research (UNIDIR) were systematically searched for grey literature and policy documents. Search terms included combinations of autonomous weapons systems, human-AI teaming, command and control, dynamic autonomy, meaningful human control, levels of automation, trust in automation, military robotics, and explainable AI, among others. Inclusion criteria required that sources address human interaction with autonomous or semi-autonomous systems in military or defense contexts, contribute theoretical frameworks applicable to autonomy management, or examine legal, ethical, or governance dimensions of autonomous weapons employment.
Sources were excluded if they addressed purely technical AI development without human factors considerations, focused exclusively on civilian applications without defense relevance, or consisted of non-peer-reviewed opinion pieces lacking empirical or theoretical substance. The chapter is organized into ten major thematic sections that collectively build the theoretical and empirical foundation for the present research. The review begins with the theoretical foundations that undergird the study, including systems theory, human factors and cognitive engineering, naturalistic decision-making, trust theory, and levels of automation frameworks. It then examines the development and classification of autonomous weapons systems, followed by a comprehensive treatment of human-AI teaming and collaboration, with particular attention to trust dynamics in military contexts. The review proceeds to examine dynamic autonomy and adaptive control mechanisms, command and control theory in the age of AI, meaningful human control and governance frameworks, legal and ethical considerations, explainable AI and transparency requirements, and computational modeling approaches. The chapter concludes with a synthesis of key findings, identification of critical research gaps, and presentation of the conceptual framework guiding the present study. Throughout, the review adopts a critical analytic stance, identifying not only what the literature establishes but also where significant disagreements, methodological limitations, and empirical gaps persist. The conceptual framework guiding this review integrates three interconnected dimensions that the literature consistently identifies as essential for successful human-AI collaboration in military settings: technological capability, human-factors alignment, and ethical-legal compliance (Pokorny, 2026).
Systems optimizing for any single dimension at the expense of others consistently fail to achieve their intended operational objectives. This tripartite framework structures the review's analysis, ensuring that each body of literature is examined not only on its own terms but also in relation to the other two dimensions. The overarching argument advanced through this review is that dynamic autonomy management represents the critical integrating mechanism through which these three dimensions can be harmonized in practice, and that the absence of empirically validated frameworks for such management constitutes the most consequential gap in the current literature.

Theoretical Foundations

The study of dynamic autonomy management in human-AI command and control for autonomous weapons systems draws upon a rich tapestry of theoretical traditions. This section examines five foundational theoretical frameworks that collectively provide the conceptual architecture for the present research: systems theory and sociotechnical systems, human factors and cognitive engineering, naturalistic decision-making, trust theory, and levels of automation frameworks. Each framework illuminates different dimensions of the complex interaction between human decision-makers and autonomous systems in military contexts, and their integration provides the multidimensional lens required to address the dissertation's research questions.

Systems Theory and Sociotechnical Systems

The integration of autonomous weapons into military command and control architectures is fundamentally a systems problem, requiring analytical frameworks capable of addressing emergent properties, nonlinear interactions, and adaptive behaviors that characterize complex sociotechnical systems.
General systems theory, as articulated by von Bertalanffy and subsequently developed across multiple disciplines, provides the foundational epistemological orientation for understanding how human operators and autonomous systems function not as isolated components but as interdependent elements within a larger operational whole. The sociotechnical systems perspective extends this insight by recognizing that technological systems and their human operators co-evolve, with the performance of each being contingent upon the design and functioning of the other (Bradshaw et al., 2013). In the context of autonomous weapons systems, the sociotechnical perspective reveals that decisions about autonomy levels cannot be divorced from the organizational, doctrinal, and cultural contexts in which those systems operate. As Bradshaw et al. (2013) argued in their influential critique of autonomous systems mythology, the "seven deadly myths" of autonomous systems stem largely from a failure to appreciate the fundamentally sociotechnical nature of human-machine interaction. Their analysis demonstrated that autonomy is not a single dimension along which systems can be linearly ranked but rather a multidimensional property that varies across different functions, contexts, and temporal scales. This insight is particularly consequential for military command and control, where the allocation of decision authority between humans and machines must be responsive to rapidly changing tactical situations, shifting rules of engagement, and evolving strategic objectives. Complex adaptive systems theory offers a further refinement particularly suited to the dynamics of military C2 environments.
Unlike linear systems models that assume predictable relationships between inputs and outputs, complex adaptive systems exhibit emergent behaviors, feedback loops, and phase transitions that cannot be fully predicted from knowledge of individual components alone. The Pokorny (2026) systematic literature review explicitly recommended applying complex adaptive systems frameworks to military human-AI teams, noting that traditional human factors approaches fail to account for emergent properties of these teams operating under extreme uncertainty. This recommendation reflects a growing recognition in the field that the interaction between human cognitive processes and AI decision-making algorithms in high-stakes military environments produces behaviors that are qualitatively different from what either humans or machines would produce independently, necessitating analytical frameworks capable of capturing these emergent dynamics. The sociotechnical systems perspective also highlights the critical importance of joint optimization—the principle that technical and social subsystems must be designed in concert rather than sequentially. Historical examples from defense applications have repeatedly demonstrated that technically optimal autonomous systems can produce catastrophically suboptimal outcomes when deployed within human organizational structures that were not designed to accommodate them. The fratricide incidents involving the Patriot air defense system during Operation Iraqi Freedom in 2003, which resulted in the shoot-down of allied aircraft, illustrate how failures in the sociotechnical interface—rather than failures of either the technical system or human operators in isolation—can produce tragic consequences (Scharre, 2018). 
These incidents underscore the necessity of designing dynamic autonomy management frameworks that account for the full sociotechnical system, including the organizational processes, communication structures, and cultural norms that shape how human operators interact with autonomous capabilities.

Human Factors and Cognitive Engineering

Human factors engineering and cognitive engineering provide the analytical tools necessary to understand how human cognitive capabilities and limitations interact with autonomous system design in military command and control environments. Rasmussen's (1983) seminal framework for human performance modeling, which distinguishes among skill-based, rule-based, and knowledge-based levels of cognitive processing, remains foundational for understanding how military operators interact with autonomous systems across different levels of engagement complexity. At the skill-based level, operators respond to familiar patterns through automated routines; at the rule-based level, they apply learned procedures to recognized situations; and at the knowledge-based level, they engage in deliberative reasoning to address novel or ambiguous circumstances. The implications for dynamic autonomy management are profound: the appropriate level of autonomy for a given situation depends critically on the cognitive processing level required of the human operator, which in turn depends on the familiarity, complexity, and time constraints of the operational context. Vicente's (1999) cognitive work analysis framework extends Rasmussen's insights by providing a systematic methodology for analyzing the cognitive demands of complex work domains and designing interfaces that support effective human performance.
Cognitive work analysis emphasizes the importance of understanding the constraints that shape work activities rather than prescribing specific procedures, making it particularly well-suited to the inherently unpredictable nature of military operations. The ecological interface design approach developed by Burns and Hajdukiewicz (2004), building on Vicente's framework, advocates for displays that make the constraints and affordances of the work domain directly visible to operators, enabling them to adapt their behavior to changing circumstances without requiring explicit procedural guidance. In the context of autonomous weapons systems, ecological interface design principles suggest that dynamic autonomy management interfaces should render the current state of human-AI authority allocation transparent and manipulable, enabling commanders to adjust autonomy levels in response to their evolving understanding of the operational situation. Endsley's (2017) comprehensive review of lessons learned from human-automation research synthesized decades of findings into principles directly applicable to autonomy design. Her analysis identified situation awareness as the critical bottleneck in human-automation interaction, demonstrating that higher levels of automation frequently degraded operators' understanding of system state and environmental conditions—the very understanding required to intervene effectively when automation failed or encountered situations beyond its competence. This finding has particularly stark implications for autonomous weapons systems, where the consequences of degraded situation awareness can include violations of international humanitarian law, friendly fire incidents, and civilian casualties.
Endsley's (2018) subsequent work argued that level of autonomy forms a key aspect of autonomy design and that the field's focus on technical capability had obscured the equally important question of how autonomy levels affect human cognitive performance and decision quality. The applied cognitive task analysis methodology developed by Militello and Hutton (1998) provides practical tools for eliciting the cognitive demands faced by military operators in complex decision environments. Their toolkit, designed for practitioners working with subject matter experts, enables systematic identification of the critical decisions, judgments, and assessments that operators must make when interacting with automated systems. In the present research context, cognitive task analysis methods are essential for understanding how military commanders currently make decisions about weapons employment and how the introduction of autonomous capabilities alters the cognitive landscape of those decisions. The cognitive engineering perspective thus complements the sociotechnical systems framework by providing micro-level analytical tools for examining the human cognitive processes that macro-level systems analyses identify as critical.

Naturalistic Decision-Making

The naturalistic decision-making (NDM) paradigm, pioneered by Klein (1998) and colleagues, provides an essential corrective to laboratory-based models of human decision-making that assume rational, deliberative processes operating under conditions of complete information and unlimited time. NDM research examines how experienced practitioners make decisions in real-world settings characterized by time pressure, uncertainty, high stakes, ill-defined goals, and dynamic conditions—precisely the conditions that characterize military command and control during weapons employment.
Klein's (1998) recognition-primed decision (RPD) model, the most influential product of the NDM tradition, demonstrates that expert decision-makers typically do not compare options analytically but rather recognize situations as instances of familiar patterns and select courses of action through rapid mental simulation of a single promising option. The implications of the RPD model for dynamic autonomy management are far-reaching. If experienced military commanders make weapons employment decisions primarily through pattern recognition and mental simulation rather than analytical comparison of alternatives, then autonomous systems designed to support those decisions must be compatible with these naturalistic cognitive processes rather than imposing alien analytical frameworks. Klein (2008) elaborated on this point in his review of NDM's evolution, emphasizing that decision support systems frequently fail when they attempt to replace expert intuition with algorithmic optimization rather than augmenting and informing the expert's existing decision-making process. This insight suggests that dynamic autonomy management frameworks must be designed not to supplant the commander's recognition-primed decision process but to enhance it by providing information, options, and recommendations in formats that support rather than disrupt naturalistic cognition. The NDM perspective also illuminates the critical challenge of maintaining expert decision-making competence in increasingly automated environments. Research in the NDM tradition has consistently demonstrated that expertise is developed and maintained through experience with consequential decisions in naturalistic settings.
When autonomous systems assume decision-making functions previously performed by human operators, the operators' opportunities to develop and maintain expertise are correspondingly reduced—a dynamic that Bainbridge (1983) identified in her classic treatment of the ironies of automation. In military contexts, this creates a particularly dangerous paradox: autonomous weapons systems are most needed in precisely those high-tempo, high-complexity situations that exceed normal human cognitive capacity, yet the automation that provides this capability simultaneously degrades the human expertise required to oversee that automation effectively and intervene when necessary. The NDM framework's emphasis on macrocognition—the cognitive processes that operate at the level of the overall task rather than individual cognitive operations—provides a particularly useful lens for examining how teams of humans and AI agents collaborate in command and control environments. Macrocognitive functions such as sensemaking, planning, adaptation, and coordination are distributed across team members and require shared mental models and common ground to function effectively. As the NDM literature demonstrates, these macrocognitive processes are fundamentally different in structure and dynamics from the microcognitive processes studied in traditional cognitive psychology, and they require correspondingly different analytical approaches. The integration of AI agents into military C2 teams introduces novel macrocognitive challenges, as the AI's information processing capabilities and limitations differ qualitatively from those of human team members, requiring new forms of shared understanding and coordination protocols.

Trust Theory

Trust constitutes perhaps the most critical mediating variable in the relationship between autonomous system capability and effective human-AI collaboration in military contexts.
The theoretical foundations of trust relevant to autonomous weapons systems span two complementary traditions: organizational trust theory and automation trust research. Mayer, Davis, and Schoorman's (1995) integrative model of organizational trust provides the foundational framework for understanding trust as a psychological state comprising the willingness to be vulnerable to the actions of another party based on the expectation that the other will perform a particular action important to the trustor, irrespective of the ability to monitor or control that other party. Their model identifies three antecedents of trust—ability, benevolence, and integrity—that have proven remarkably durable across diverse research contexts, including human-machine interaction. Lee and See's (2004) landmark review established the foundational framework for trust in automation specifically, arguing that trust in automation parallels interpersonal trust in important ways but also differs in critical respects. Their analysis identified three bases of trust in automation—performance, process, and purpose—that correspond roughly to Mayer et al.'s ability, integrity, and benevolence dimensions but are adapted to the specific characteristics of human-machine interaction. Lee and See (2004) argued that designing for "appropriate reliance" rather than maximal trust should be the goal of automation design, as both overtrust (automation complacency) and undertrust (automation disuse) can produce dangerous outcomes. This emphasis on calibrated trust rather than maximal trust is particularly consequential for autonomous weapons systems, where overtrust could lead to unlawful engagements and undertrust could lead to failure to employ lawful and necessary defensive capabilities.
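Lee and See's (2004) notion of appropriate reliance can be illustrated with a minimal sketch. The function below is purely hypothetical (the names, the 0–1 scales, and the tolerance threshold are illustrative assumptions, not constructs from the cited literature): it compares an operator's subjective trust against a system's demonstrated reliability and flags the mismatch conditions the authors warn against.

```python
def classify_reliance(operator_trust: float, system_reliability: float,
                      tolerance: float = 0.10) -> str:
    """Illustrative trust-calibration check (hypothetical scales).

    Both inputs are on a 0-1 scale. Appropriate reliance, in Lee and
    See's (2004) sense, means trust roughly tracks demonstrated
    reliability; large gaps in either direction are risk signals.
    """
    gap = operator_trust - system_reliability
    if gap > tolerance:
        return "overtrust"   # risk of automation complacency
    if gap < -tolerance:
        return "undertrust"  # risk of automation disuse
    return "calibrated"

# Example: a 90%-reliable system trusted at only 0.55 signals disuse risk.
print(classify_reliance(0.55, 0.90))  # -> undertrust
```

In a dynamic autonomy management context, a check of this kind would be one input among many; the point of the sketch is only that calibration is a relation between trust and evidence, not a property of trust alone.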
Hoff and Bashir (2015) advanced the field by proposing a comprehensive three-layer model of trust in automation that integrates dispositional trust (stable individual differences in propensity to trust), situational trust (context-dependent factors including workload, risk, and organizational culture), and learned trust (dynamic trust that evolves through experience with a specific system). Their framework provided the most complete account to date of the multiple factors that simultaneously influence an operator's trust in an automated system, and it highlighted the inherently dynamic nature of trust—a characteristic particularly relevant to dynamic autonomy management, where trust levels must be continuously recalibrated as operational conditions change and system performance varies. Schaefer et al. (2016) complemented this work with a comprehensive meta-analysis of factors influencing trust development in automation, identifying 126 distinct factors organized across human-related, automation-related, and environment-related categories, with implications specifically articulated for future Army systems. The phenomenon of algorithm aversion, documented by Dietvorst et al. (2015), presents a significant challenge to trust calibration in military AI systems. Their experimental research demonstrated that people are more likely to abandon algorithmic decision aids after observing them make errors than they are to abandon human advisors who make equivalent or even larger errors. This asymmetric response to failure has significant implications for autonomous weapons systems, where any AI-related incident—whether or not attributable to autonomous decision-making—could trigger widespread distrust and disuse of autonomous capabilities across military organizations. Conversely, de Visser et al.
(2018) examined trust repair in human-machine interaction, finding that the strategies effective for repairing trust in machines differ from those effective for repairing interpersonal trust, with implications for how military organizations should respond to autonomous system failures to restore appropriate levels of operator trust. The cultural dimension of trust in automation has received increasing attention. Chien et al. (2014) developed an empirical model of cultural factors affecting trust in automation, finding significant differences across cultural groups in baseline trust levels, trust calibration rates, and responses to automation failures. Lyons and Guznov (2019) extended this line of research by examining individual differences in the "perfect automation schema"—the expectation that automated systems should perform flawlessly—and its relationship to trust formation and maintenance. These findings have important implications for autonomous weapons systems employed by multinational coalitions, where operators from different cultural backgrounds may exhibit systematically different trust responses to the same autonomous system behavior, potentially creating coordination challenges in joint operations.

Levels of Automation Framework

The levels of automation framework provides the most direct theoretical foundation for conceptualizing dynamic autonomy management. Sheridan and Verplank (1978) established the original taxonomy of automation levels in their seminal study of human and computer control of undersea teleoperators, proposing a ten-level scale ranging from fully manual control to fully autonomous operation. Their framework, though developed for teleoperator systems, established the conceptual vocabulary that continues to dominate discourse on human-automation interaction across all domains, including military weapons systems.
The enduring influence of their taxonomy lies in its recognition that automation is not a binary condition but a continuum along which different functions can be allocated to humans and machines in varying proportions. Parasuraman, Sheridan, and Wickens (2000) refined and extended this framework in their highly influential model for types and levels of human interaction with automation. Their critical contribution was the recognition that automation level is not a single dimension but must be specified separately for each of four information-processing stages: information acquisition, information analysis, decision and action selection, and action implementation. This four-stage model fundamentally reconceptualized the automation design problem by demonstrating that a system can simultaneously operate at different automation levels for different functions—a system might, for example, be highly automated in its information acquisition while requiring full human authority for action implementation. For autonomous weapons systems, this framework provides the analytical foundation for designing dynamic autonomy management schemes that allocate different levels of autonomy to different stages of the engagement decision cycle. Endsley (2017) provided an extensive review of the human-automation literature and its implications for autonomy design, identifying critical design principles for maintaining human performance in increasingly automated systems. Her analysis demonstrated that higher levels of automation, while reducing workload and potentially improving speed, frequently produced degraded situation awareness, complacency, skill degradation, and decreased ability to detect and respond to automation failures. These findings led Endsley (2018) to argue that level of autonomy should be recognized as a key aspect of autonomy design rather than a secondary consideration subordinate to technical capability.
Her work has been particularly influential in military contexts, where the consequences of automation-induced performance degradation can be catastrophic. Kaber and Endsley (2004) conducted experimental research specifically examining the effects of different levels of automation and adaptive automation on human performance, situation awareness, and workload in dynamic control tasks. Their findings demonstrated that intermediate levels of automation, combined with adaptive adjustment of automation level based on task demands, produced superior performance compared to both fully manual and highly automated conditions. These results provide direct empirical support for the concept of dynamic autonomy management—the idea that autonomy levels should be adjusted in real-time based on operational conditions rather than fixed at a single level—and suggest that the optimal approach is not to maximize automation but to match automation level to the demands of the situation. The application of levels of automation frameworks to military systems has been advanced by several specialized taxonomies. The Autonomy Levels for Unmanned Systems (ALFUS) framework developed by the National Institute of Standards and Technology (Huang, 2008) provided a standardized vocabulary for describing autonomy levels across different types of unmanned systems, incorporating three dimensions: mission complexity, environmental complexity, and human independence. Clough (2002) addressed the practical challenge of measuring and comparing autonomy levels across different unmanned aerial vehicle systems. The SAE International (2021) taxonomy for driving automation, while developed for automotive applications, has been influential in shaping thinking about levels of autonomy in military systems due to its clear delineation of the responsibilities of human operators and automated systems at each level.
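The stage-specific and adaptive ideas discussed above can be combined in a small illustrative sketch. This is a toy model, not an implementation from any cited work: each of Parasuraman, Sheridan, and Wickens's (2000) four information-processing stages carries its own level on a 1–10 Sheridan-style scale, and a hypothetical adaptive rule (the thresholds and inputs are invented for illustration) adjusts the decision-selection level in the spirit of Kaber and Endsley's (2004) adaptive automation.

```python
from dataclasses import dataclass


@dataclass
class AutonomyProfile:
    """Per-stage automation levels (1 = fully manual, 10 = fully autonomous),
    one per stage of the Parasuraman et al. (2000) four-stage model."""
    information_acquisition: int
    information_analysis: int
    decision_selection: int
    action_implementation: int


def adapt_profile(profile: AutonomyProfile, operator_workload: float,
                  target_ambiguity: float) -> AutonomyProfile:
    """Hypothetical adaptive rule: raise decision-stage autonomy under high
    operator workload, but cap it when target ambiguity calls for human
    judgment. Inputs are on illustrative 0-1 scales."""
    level = profile.decision_selection
    if operator_workload > 0.8:
        level = min(level + 2, 10)   # offload decisions when overloaded
    if target_ambiguity > 0.5:
        level = min(level, 4)        # ambiguous targets: keep human in the loop
    return AutonomyProfile(profile.information_acquisition,
                           profile.information_analysis,
                           level,
                           profile.action_implementation)


# A system can be highly automated in sensing yet near-manual in firing:
baseline = AutonomyProfile(information_acquisition=9, information_analysis=7,
                           decision_selection=6, action_implementation=2)
adapted = adapt_profile(baseline, operator_workload=0.9, target_ambiguity=0.7)
print(adapted.decision_selection)  # -> 4
```

The design point the sketch makes is the one the four-stage model makes: autonomy is adjusted per function, so raising the decision-selection level leaves action implementation untouched.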
Williams and Scharre (2015) applied these frameworks specifically to defense applications in their NATO-published analysis of autonomous systems issues for defense policymakers, identifying the unique challenges that military applications pose for traditional levels of automation taxonomies.

Autonomous Weapons Systems: Development and Classification

The development of autonomous weapons systems represents a continuum of increasing machine autonomy in the employment of lethal force, extending from the earliest mechanical triggering devices to contemporary AI-enabled systems capable of autonomous target identification, tracking, and engagement. Understanding this developmental trajectory and the classification schemes used to categorize autonomous systems is essential context for examining dynamic autonomy management, as the feasibility and design of autonomy management frameworks depend critically on the technical characteristics and operational employment patterns of the systems being managed.

Historical Evolution of Autonomous Weapons

The genealogy of autonomous weapons extends far deeper into military history than popular discourse typically acknowledges. Landmines and naval mines, among the earliest weapons to select and engage targets without direct human intervention at the moment of activation, have been employed for centuries and constitute a form of autonomous weapon in the most basic sense—they detect the presence of a target (through pressure, magnetic signature, or acoustic signature) and initiate an engagement without a human making a deliberate fire decision for each individual engagement (Scharre, 2018).
While these early autonomous weapons operated on simple mechanical or electromechanical triggers without anything resembling artificial intelligence, they established the fundamental conceptual and legal precedent for weapons that exercise some degree of autonomous function in the engagement decision cycle. The modern era of autonomous weapons began with the development of automated air defense systems in the latter half of the twentieth century. The Phalanx Close-In Weapon System, deployed by the U.S. Navy since 1980, represented a watershed in weapons autonomy by providing a fully automated capability to detect, track, and engage incoming anti-ship missiles without human authorization for individual engagements. The system's design reflected a recognition that the speed of incoming threats had exceeded the reaction time available for human-in-the-loop decision-making, creating an operational imperative for automated response that would recur with increasing frequency as weapons technology continued to accelerate. The Patriot air defense system, first deployed during Operation Desert Storm in 1991, provided another landmark in autonomous weapons development, with its capacity for autonomous engagement of ballistic missile threats (Singer, 2009). The Patriot's operational history, including both its successes and its fratricide incidents during Operation Iraqi Freedom, provided some of the earliest and most consequential real-world data on the challenges of managing autonomous weapons in complex operational environments (Scharre, 2018). The proliferation of remotely piloted and increasingly autonomous unmanned aerial systems from the early 2000s onward marked another inflection point. Krishnan (2009) provided an early comprehensive analysis of the technological trajectory toward fully autonomous weapons, examining both the technical feasibility and the legal and ethical implications.
The evolution from remotely piloted systems like the MQ-1 Predator, which required continuous human control for all engagement decisions, to increasingly autonomous systems capable of autonomous navigation, target detection, and mission adaptation reflected a broader trend toward distributing cognitive functions between human operators and machine systems. Boulanin and Verbruggen (2017) conducted the most comprehensive mapping of autonomy development in weapons systems to date for the Stockholm International Peace Research Institute, documenting systems across multiple countries and categorizing them by the degree of autonomy exercised in critical functions including target identification, tracking, selection, and engagement. The emergence of drone swarm technology has further accelerated the trajectory toward autonomous weapons employment. Scharre (2014) analyzed the implications of swarming tactics for military operations, arguing that the coordination requirements of large-scale swarm operations would necessitate significant autonomous decision-making by individual swarm elements. Kallenborn (2021) examined whether drone swarms represented a genuinely novel military capability or merely an incremental extension of existing unmanned systems, concluding that while individual drone capabilities might be modest, the collective capabilities of coordinated swarms represented a qualitative shift in autonomous weapons employment. Hambling (2015) provided a comprehensive technical and operational analysis of small drone proliferation and its implications for future conflict, documenting how commercially available drone technology was enabling non-state actors to acquire autonomous or semi-autonomous weapons capabilities previously available only to advanced militaries. The DARPA Offensive Swarm-Enabled Tactics (OFFSET) program (DARPA, 2017) represented the most ambitious U.S.
government effort to develop and operationalize swarm tactics, seeking to enable small ground and aerial units to employ swarms of 250 or more unmanned systems in complex urban environments.

Taxonomy and Classification of Autonomous Systems

The classification of autonomous systems has proven to be one of the most contentious and consequential challenges in the field, as the taxonomic framework adopted directly shapes legal, ethical, and policy analysis of autonomous weapons. The most widely used classification scheme distinguishes among three categories based on the nature of human involvement in the engagement decision cycle: human-in-the-loop systems, where a human operator must authorize each engagement decision; human-on-the-loop systems, where the system can select and engage targets autonomously but a human operator monitors the process and can intervene to override or abort; and human-out-of-the-loop systems, where the system operates entirely without human supervision once activated (Sharkey, 2007, 2012). This tripartite classification, while intuitively appealing, has been criticized for oversimplifying the complex, multidimensional nature of autonomy in weapons systems. The U.S. Department of Defense Directive 3000.09, updated in 2023, adopted a somewhat different classification, distinguishing between autonomous and semi-autonomous weapons systems based on whether the system, once activated, can select and engage targets without further human input. This regulatory classification has had outsized influence on both U.S.
weapons development and international governance discussions, but as Taddeo and Blanchard (2022) demonstrated in their comparative analysis of autonomous weapons definitions, there is no international consensus on the precise boundaries between autonomous and semi-autonomous systems, and different states and organizations have adopted definitions that vary in ways that have significant practical and legal consequences. Several more granular taxonomic frameworks have been proposed to capture the complexity that the tripartite classification obscures. The ALFUS framework developed by the National Institute of Standards and Technology (Huang, 2008) proposed a three-dimensional autonomy space defined by mission complexity, environmental complexity, and human independence. Clough (2002) argued for metrics-based approaches to autonomy classification that assess specific capabilities rather than assigning systems to categorical levels. The SAE International (2021) taxonomy for driving automation, while developed for civilian autonomous vehicles, introduced the influential concept of conditional, high, and full automation levels distinguished by the nature and timing of human fallback responsibilities—a framework that has been adapted to military applications. Kim et al. (2023) analyzed manned-unmanned teaming systems specifically, proposing classification frameworks that account for the collaborative rather than independent nature of modern autonomous military systems. Williams and Scharre (2015) argued in their NATO-published analysis that the most useful taxonomic approach for defense policymakers is function-specific rather than system-level, assessing the degree of autonomy exercised in each critical function—target detection, identification, tracking, selection, and engagement—rather than assigning a single autonomy level to the overall system. This function-specific approach aligns naturally with the Parasuraman et al.
(2000) four-stage model of automation levels and provides the most direct foundation for dynamic autonomy management frameworks, as it enables the independent adjustment of autonomy levels for different functions in response to changing operational conditions. The Hague Centre for Strategic Studies (2022) further advanced this function-specific approach in their comprehensive study of robotic and autonomous systems in military operations, proposing design frameworks that explicitly account for the variability of autonomy requirements across different operational phases and contexts. Sharma et al. (2024) examined the regulatory implications of different taxonomic approaches, arguing that legal governance frameworks for military AI-driven autonomy must be sufficiently flexible to accommodate the function-specific and context-dependent nature of autonomy in modern weapons systems. Their analysis highlighted the inadequacy of categorical autonomy classifications for regulatory purposes and advocated for governance approaches that specify requirements for each critical function rather than for the system as a whole. Van der Velde et al. (2021) provided a complementary perspective, assessing the military applicability of various robotic and autonomous systems classification frameworks and identifying the characteristics most relevant to operational employment decisions.

Current Autonomous and Semi-Autonomous Weapons Programs

Contemporary autonomous and semi-autonomous weapons programs span the full spectrum of military domains and illustrate both the current state of the art and the trajectory of future development. The Aegis Combat System, employed on U.S. Navy cruisers and destroyers, represents one of the most sophisticated operational autonomous weapons systems, integrating radar, weapons control, and engagement decision-making into a system capable of autonomous response to multiple simultaneous threats.
Operating in its auto-special mode, the Aegis system can detect, track, and engage incoming anti-ship missiles and aircraft without individual human authorization for each engagement—a capability deemed necessary by the speed of modern anti-ship missile threats that can close to engagement range in seconds (Scharre, 2018). Israel's Iron Dome air defense system provides another prominent example of a system operating with significant autonomous capability. Designed to intercept short-range rockets and artillery shells, Iron Dome employs autonomous threat classification and engagement decision-making, with human operators providing supervisory oversight rather than authorizing individual engagements. The system's operational employment in multiple conflicts since its deployment in 2011 has provided some of the most extensive real-world data on autonomous weapons performance, though detailed analysis of its decision-making processes and human oversight practices remains largely classified. In the realm of offensive autonomous weapons, the Long Range Anti-Ship Missile (LRASM) represents a significant advance in autonomous targeting capability. Designed to autonomously detect, classify, and engage maritime targets in contested electromagnetic environments where GPS and data-link communications may be denied, LRASM embodies the operational logic driving autonomous weapons development: the need to maintain weapons effectiveness in environments where continuous human control is technically infeasible (Sayler, 2023). The Collaborative Combat Aircraft (CCA) program, formerly known as Loyal Wingman, represents the U.S. Air Force's most ambitious effort to develop autonomous combat aircraft designed to operate alongside manned fighters, with the autonomous aircraft performing roles ranging from sensor extension to weapons delivery under the supervision of the manned aircraft's pilot.
The MQ-25 Stingray represents an adjacent development—an unmanned carrier-based aerial refueling system (Sayler, 2024) that, while not a weapons system per se, advances the integration of autonomous systems into carrier air wing operations and establishes operational patterns for manned-unmanned teaming in naval aviation. Swarming technology represents the frontier of autonomous weapons development. The DARPA OFFSET program (DARPA, 2017) aimed to develop tactics, techniques, and procedures enabling small military units to employ swarms of over 250 unmanned systems in urban environments, with individual swarm elements making autonomous decisions about navigation, target identification, and coordination. The operational implications of swarm technology for dynamic autonomy management are profound: the sheer number of autonomous agents in a swarm makes individual human oversight of each agent's decisions impractical, requiring new paradigms for human supervisory control that operate at the collective rather than individual level (Scharre, 2014). The increasing availability of commercial drone technology has also enabled non-state actors to develop improvised autonomous capabilities, as documented by Rossiter (2018), creating new challenges for counter-autonomy operations.

Military Robotics and Unmanned Systems

The broader landscape of military robotics and unmanned systems provides essential context for understanding autonomous weapons development, as advances in unmanned system technology in non-weapons roles frequently migrate to weapons applications. The U.S. Department of Defense's Unmanned Systems Integrated Roadmap (U.S. Department of Defense, 2011) established the strategic vision for unmanned systems development across all military domains—air, ground, and maritime—and articulated the anticipated trajectory from remotely operated systems toward increasing autonomous capability.
Sayler (2023) provided a comprehensive overview of the current state of U.S. unmanned systems programs, documenting the accelerating pace of development and the expanding range of missions assigned to unmanned platforms.

Human-robot interaction in military operations has been the subject of extensive research. Barnes and Jentsch (2017) edited a comprehensive volume examining human-robot interaction issues in future military operations, addressing topics including operator workload, trust, communication, and team dynamics. Chen and Barnes (2014) reviewed human-agent teaming for multirobot control, identifying key human factors issues including the cognitive demands of supervising multiple autonomous systems simultaneously, the challenge of maintaining situation awareness across multiple robotic agents, and the design requirements for effective human-robot team interfaces. Their finding that effective multirobot control requires fundamentally different interface designs than single-robot control has significant implications for dynamic autonomy management in multi-agent military systems.

Ground robotics has evolved from simple teleoperated explosive ordnance disposal robots to increasingly autonomous systems capable of independent navigation, obstacle avoidance, and limited decision-making in complex terrain. Feickert (2018) analyzed the implications of ground robotics and autonomous systems for military operations and congressional oversight, documenting the gap between technical capability and the policy frameworks governing autonomous system employment.
Naval autonomous systems, including unmanned surface vessels and unmanned undersea vehicles, have expanded the domain of autonomous military operations to the maritime environment, introducing unique challenges related to communication in the underwater environment, extended autonomous operation beyond communication range, and the legal implications of unmanned vessels operating in international waters.

The concept of manned-unmanned teaming (MUM-T) has emerged as the dominant operational paradigm for integrating unmanned systems into military operations. Kim et al. (2023) analyzed MUM-T system development at the program level, documenting the technical and operational challenges of establishing effective collaboration between manned and unmanned platforms. The MUM-T concept is directly relevant to dynamic autonomy management because it requires real-time adjustment of the autonomy level granted to unmanned systems based on the operational situation, the communication environment, and the manned platform operator's cognitive capacity and situational understanding. Mayer (2015) examined the innovation trajectory of military drone technology, providing insights into how technological development interacts with operational requirements and organizational culture to shape the pace and direction of autonomous capability development.

Counter-Autonomy and Adversarial Considerations

The development of autonomous weapons systems has inevitably spawned a parallel field of counter-autonomy research and adversarial analysis. As autonomous systems assume greater roles in military operations, adversaries develop tactics, techniques, and procedures designed to exploit the vulnerabilities inherent in autonomous decision-making—a dynamic that has significant implications for dynamic autonomy management frameworks.
Altmann and Sauer (2017) examined the strategic stability implications of autonomous weapons systems, arguing that the interaction between competing autonomous systems could produce dangerous escalation dynamics that neither side intends or can control. Their analysis highlighted the risk of "flash wars"—rapid escalatory spirals driven by autonomous system interactions operating faster than human decision-making—as a particularly consequential threat to strategic stability. Garcia (2018) analyzed how lethal artificial intelligence could reshape the future of international peace and security, identifying several pathways through which autonomous weapons could destabilize existing security arrangements. Johnson (2019) extended this analysis to examine the broader implications of artificial intelligence for international security and future warfare, arguing that AI-enabled military capabilities would fundamentally alter the character of conflict in ways that existing strategic frameworks are poorly equipped to address.

The adversarial dimension of autonomous weapons employment underscores the necessity of dynamic autonomy management frameworks that can adapt not only to friendly operational conditions but also to adversary tactics specifically designed to exploit the vulnerabilities of autonomous systems, including spoofing, jamming, algorithmic manipulation, and adversarial machine learning attacks. The DARPA Assured Autonomy program (DARPA, 2019) represented a significant effort to address the verification and validation challenges that adversarial threats pose to autonomous military systems. The program sought to develop methods for providing mathematical guarantees of autonomous system behavior within specified operational parameters—a capability essential for maintaining confidence in autonomous systems operating in adversarial environments.
Holland Michel (2020) examined the related challenge of predictability and understandability in military AI, arguing that the inability to fully predict and understand AI system behavior constitutes a fundamental obstacle to meaningful human control in military applications. These technical challenges reinforce the argument for dynamic autonomy management frameworks that can increase human involvement in the decision process when autonomous system reliability is uncertain or when adversarial interference is suspected.

Human-AI Teaming and Collaboration

The integration of autonomous systems into military operations fundamentally reconstitutes the nature of command and control by introducing artificial agents as functional team members rather than mere tools. This section examines the theoretical and empirical literature on human-AI teaming, with particular attention to the trust dynamics that mediate the effectiveness of human-AI collaboration in military contexts. The literature reveals a field in rapid transition from conceptualizing AI as a tool to be used by human operators toward understanding AI as a teammate whose cognitive capabilities, limitations, and social dynamics must be managed through deliberate teaming practices.

Foundations of Human-AI Teaming

The theoretical foundations of human-AI teaming draw on established research in team cognition, organizational psychology, and human-robot interaction, while extending these traditions to accommodate the unique characteristics of artificial agents. Johnson et al. (2014) proposed the coactive design framework for human-robot interaction, grounding effective human-machine collaboration in the concept of interdependence—the mutual reliance of team members on each other's activities and contributions.
Their framework identifies three types of interdependence—observability, predictability, and directability—as the essential requirements for effective human-machine teaming, arguing that system designers must ensure that AI agents are observable (their actions and states can be perceived by human teammates), predictable (their behavior can be anticipated based on knowledge of their capabilities and the situation), and directable (their behavior can be modified by human teammates when necessary). These three properties map directly onto the requirements for dynamic autonomy management, which demands that human commanders be able to observe, predict, and direct the behavior of autonomous weapons systems.

Cannon-Bowers et al. (1993) established the foundational theory of shared mental models in expert team decision-making, demonstrating that effective team performance depends on team members sharing compatible knowledge structures about the task, the team, the equipment, and the situation. The challenge of extending shared mental model theory to human-AI teams is substantial, as AI systems do not form mental models in the same cognitive sense as human team members. Demir et al. (2017) addressed this challenge in their research on team situation awareness within the context of human-autonomy teaming, demonstrating that effective human-AI teams develop functional analogs to shared mental models through structured communication protocols and transparent system behavior. McNeese et al. (2018) provided complementary findings from their research on teaming with synthetic teammates, showing that human team members engaged in qualitatively different coordination behaviors when teaming with AI agents compared to human teammates, with implications for both team training and interface design. Cooke et al.
(2013) advanced the theoretical understanding of team cognition through their interactive team cognition framework, which conceptualizes team cognition not as a property residing in individual team members' heads but as an emergent property of the interactions among team members. This perspective is particularly useful for understanding human-AI teams, where the cognitive processes are not merely aggregated across human and artificial agents but emerge from the dynamic interactions between them. Fiore and Wiltshire (2016) extended this perspective by examining the role of technology as a teammate in supporting team cognitive processes, arguing that AI agents function as cognitive extensions of human team members rather than independent cognitive entities. Their framework of external cognition provides theoretical grounding for understanding how autonomous systems can support distributed cognitive processes in military command and control.

O'Neill et al. (2022) conducted a comprehensive review and analysis of the empirical literature on human-autonomy teaming, synthesizing findings across multiple research domains to identify the factors that most strongly influence human-autonomy team performance. Their analysis revealed that the most effective human-autonomy teams were characterized by dynamic task allocation (the ability to redistribute functions between humans and autonomous agents in response to changing conditions), transparent system behavior (the ability of human teammates to understand the autonomous agent's decision processes), and calibrated trust (an accurate assessment of the autonomous agent's capabilities and limitations). Bradshaw et al.
(2013) provided a complementary analysis by identifying and debunking seven prevalent myths about autonomous systems that impede effective human-AI teaming, including the myth that full autonomy is achievable or desirable, the myth that the autonomy levels framework is sufficient for guiding system design, and the myth that machines will simply replace humans in complex tasks.

Trust in AI and Autonomous Systems

Trust calibration—the degree to which an operator's trust in an autonomous system matches the system's actual capabilities and reliability—represents the central challenge for effective human-AI collaboration in military operations. Hancock et al. (2011) conducted a meta-analysis of factors affecting trust in human-robot interaction, finding that robot-related factors (particularly performance and reliability) had the strongest influence on trust, followed by environmental factors (including task complexity and the consequences of errors), with human-related factors (such as personality traits and prior experience) exerting the weakest influence. This finding suggests that trust calibration efforts should focus primarily on ensuring that autonomous system behavior accurately communicates the system's actual capabilities and limitations to human operators.

The construct of automation complacency—the tendency for human operators to become overly reliant on automated systems and to reduce their monitoring and cross-checking of automated outputs—has been extensively documented as a threat to effective human-automation interaction. Parasuraman and Manzey (2010) provided a comprehensive review of complacency and bias in human use of automation, proposing an attentional integration framework that explains complacency as a consequence of attentional resource allocation rather than a generalized attitude of overreliance.
Their framework predicts that complacency is most likely under conditions of high workload, where operators lack the attentional resources to maintain active monitoring of automated systems, and in situations where the automated system has a history of reliable performance that reinforces the expectation of continued reliability. In military contexts, these conditions are frequently present simultaneously: operators face high cognitive demands from multiple concurrent tasks, and autonomous systems typically demonstrate high reliability during routine operations, with failures occurring precisely when conditions deviate from the system's training distribution.

Dzindolet et al. (2003) examined the role of trust in automation reliance through a series of experimental studies that demonstrated the complex relationship between trust, system reliability, and reliance behavior. Their findings revealed that operators' reliance on automated systems was mediated not only by their trust in the system but also by their understanding of the reasons for the system's recommendations—a finding that anticipated the subsequent emphasis on explainable AI by more than a decade. Jian et al. (2000) developed one of the most widely used empirically validated scales for measuring trust in automated systems, providing a foundation for quantitative trust assessment that has been applied across diverse automation contexts including military applications.

Madhavan and Wiegmann (2007) provided an integrative review of the similarities and differences between human-human and human-automation trust, identifying key dimensions along which trust in machines diverges from interpersonal trust. Their analysis highlighted the tendency for humans to hold machines to higher reliability standards than human collaborators, a finding consistent with Dietvorst et al.'s (2015) algorithm aversion research.
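The notion of trust calibration running through this literature can be made concrete with a toy numeric comparison between subjective trust and observed reliability. The function name, the 0–1 scales, and the tolerance threshold below are our own illustrative assumptions, not a published metric from any of the works cited:

```python
# Toy sketch of trust calibration as a gap between subjective trust and
# observed reliability. All names, scales, and thresholds are illustrative
# assumptions, not drawn from the cited literature.
def calibration_state(operator_trust: float, system_reliability: float,
                      tolerance: float = 0.10) -> str:
    """Both inputs on a 0-1 scale; returns a coarse calibration label."""
    gap = operator_trust - system_reliability
    if gap > tolerance:
        return "overtrust"    # complacency risk: trust exceeds reliability
    if gap < -tolerance:
        return "undertrust"   # disuse risk: reliability exceeds trust
    return "calibrated"

# An operator trusting at 0.95 a system that is correct 70% of the time:
assert calibration_state(0.95, 0.70) == "overtrust"
```

Under this toy framing, complacency and algorithm aversion are simply the two signs of the same miscalibration gap, which is the intuition the cited trust-calibration studies formalize with validated scales and behavioral measures.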
Wickens and Dixon (2007) synthesized the literature on the benefits of imperfect diagnostic automation, demonstrating that even unreliable automation can improve human performance if operators can learn to calibrate their trust appropriately—a finding that underscores the importance of trust calibration mechanisms in dynamic autonomy management frameworks. Okamura and Yamada (2020) advanced the field by developing an adaptive trust calibration mechanism for human-AI collaboration, demonstrating that real-time adjustment of system behavior based on estimated operator trust levels could improve trust calibration and overall human-AI team performance.

Trust in Military AI Contexts

The application of trust theory to military AI contexts introduces unique considerations related to the high-stakes nature of military decisions, the hierarchical organizational structure of military institutions, and the cultural factors that shape military attitudes toward technology. Lyons et al. (2016) examined the engineering of trust in complex automated systems with specific attention to military applications, arguing that trust engineering must be incorporated into the system design process from the outset rather than addressed as an afterthought. Their framework identified specific design features—including transparent decision processes, predictable behavior, and clear communication of system confidence levels—that promote appropriate trust calibration in military operators. Schaefer et al.'s (2016) meta-analysis of factors influencing trust development in automation, conducted with specific attention to implications for future Army systems, identified a comprehensive taxonomy of trust-relevant factors organized into human, automation, and environmental categories.
Their analysis revealed that the factors most strongly influencing trust in military contexts included the autonomous system's reliability, the operator's understanding of the system's decision processes, the perceived consequences of errors, and the degree of organizational support for autonomous system employment. De Visser et al. (2018) examined trust repair in the context of increasing autonomy in human-machine systems, finding that effective trust repair strategies for autonomous systems differ from those effective for less autonomous automation, with more autonomous systems requiring more elaborate and transparent repair processes to restore operator trust after failures.

The organizational and cultural dimensions of trust in military AI have received increasing attention. Research by Chien et al. (2014) demonstrated that cultural background significantly influences trust formation and calibration in human-automation interaction, with implications for multinational military operations employing shared autonomous systems. Lyons and Guznov (2019) examined individual differences in trust in human-machine interaction across multiple studies, identifying the perfect automation schema—the expectation that automated systems should perform without errors—as a significant predictor of trust formation patterns. Their work revealed that military personnel with strong perfect automation schemas exhibited more extreme trust responses to automation failures, suggesting that trust management interventions for military AI systems must account for individual differences in automation expectations.
The National Academies of Sciences, Engineering, and Medicine (2021) assessment of human-AI teaming identified trust as one of the most critical research needs for military applications, noting that existing trust frameworks developed in civilian contexts may not fully capture the unique dynamics of trust in military environments characterized by extreme stakes, time pressure, and organizational hierarchy. The assessment called for dedicated research on trust formation, maintenance, and repair in military human-AI teams, with particular attention to the effects of operational experience, training, and organizational culture. This call reflects a growing recognition that trust in military AI is not merely a scaled-up version of trust in civilian automation but involves qualitatively different dynamics related to the lethal consequences of trust miscalibration and the institutional culture of military organizations.

Human-Robot Interaction in Military Operations

Human-robot interaction (HRI) in military operations presents distinct challenges that differentiate it from civilian HRI contexts. Barnes and Jentsch (2017) provided a comprehensive examination of HRI issues in future military operations, documenting the cognitive demands placed on military operators who must supervise autonomous systems while simultaneously managing their own tactical responsibilities in high-threat environments. The dual-task nature of military HRI—where the operator must both direct the autonomous system and maintain awareness of the broader tactical situation—creates workload dynamics that are qualitatively different from civilian applications where the operator's primary task is typically the supervision of the autonomous system itself.
Chen and Barnes (2014) identified specific human factors challenges in multirobot control that are particularly acute in military applications, including the problem of attention allocation across multiple autonomous agents with varying levels of autonomy and urgency, the challenge of maintaining coherent situation awareness when information is distributed across multiple robotic platforms, and the design of interfaces that enable effective supervisory control without overwhelming the operator with information. Their research demonstrated that the optimal ratio of autonomous systems to human supervisors depends not only on the autonomous systems' level of autonomy but also on the operational tempo, threat environment, and communication reliability—all factors that vary dynamically during military operations and that dynamic autonomy management frameworks must accommodate.

The emerging paradigm of manned-unmanned teaming introduces additional HRI challenges specific to the collaborative employment of manned and unmanned military platforms. In MUM-T operations, the human operator in the manned platform must manage their own tactical situation while simultaneously directing the actions of one or more unmanned wingmen, a cognitive demand that increases rapidly with the number and complexity of unmanned systems under supervision. The Collaborative Combat Aircraft program and similar initiatives worldwide are generating new research requirements for HRI design that balances the operational benefits of unmanned wingmen against the cognitive costs imposed on the manned platform's crew. Work and Scharre's (2015) analysis of autonomy's role in the third offset strategy identified MUM-T as a critical capability that would require significant advances in human-robot interaction design to realize its operational potential.
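The supervisor-to-robot ratio question raised above is often approached in the multirobot HRI literature through the fan-out heuristic, which estimates how many robots one operator can service from the robot's neglect time (how long it performs acceptably unattended) and interaction time (how long each servicing takes). The sketch below states that heuristic; the numbers are invented for illustration and the formulation is a simplification of the published metrics:

```python
# Illustrative sketch of the fan-out heuristic from the multirobot HRI
# literature. The specific numbers are invented; real estimates require
# empirical measurement of neglect and interaction times.
def fan_out(neglect_time_s: float, interaction_time_s: float) -> float:
    """FO = (NT + IT) / IT: each robot needs IT seconds of operator
    attention out of every NT + IT seconds of operating time."""
    return (neglect_time_s + interaction_time_s) / interaction_time_s

# A robot that operates 120 s unattended and needs 30 s of servicing:
assert fan_out(120, 30) == 5.0
```

The heuristic makes Chen and Barnes's point concrete: raising autonomy (longer neglect time) or streamlining interfaces (shorter interaction time) both expand the supervisory span, but degraded communications or higher tempo shrink it dynamically.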
Team Performance and Effectiveness with AI Agents

Research on team performance with AI agents has revealed both the potential benefits and the characteristic challenges of integrating artificial agents into human teams. Cannon-Bowers et al. (1993) established the foundational link between shared mental models and team performance, demonstrating that teams with more accurate and complete shared mental models exhibited superior coordination, communication, and task performance. Extending this framework to human-AI teams, O'Neill et al. (2022) found that the effectiveness of human-autonomy teams depends critically on the degree to which human team members develop accurate mental models of the autonomous agent's capabilities, limitations, and decision-making processes. When human team members held inaccurate mental models of their AI teammates, team performance suffered regardless of the AI's actual capability level.

Fiore and Wiltshire (2016) examined how technology functions as a teammate in supporting team cognitive processes, arguing that AI agents can enhance team performance by serving as external cognitive resources that extend the team's information processing capacity. However, their analysis also identified the risk that excessive reliance on AI cognitive support could degrade the team's organic cognitive capabilities over time, mirroring Bainbridge's (1983) concerns about the ironies of automation at the team level. Cooke et al. (2013) provided theoretical foundations for understanding how team cognition emerges from interactions among team members, including interactions between human and artificial agents, suggesting that effective human-AI teams must be designed not merely to aggregate the capabilities of human and artificial agents but to cultivate emergent cognitive properties that arise from their interaction.
The measurement and evaluation of human-AI team performance presents significant methodological challenges. Traditional team performance metrics, developed for all-human teams, may not adequately capture the unique dynamics of human-AI collaboration. The Pokorny (2026) systematic review noted the absence of standardized evaluation methodologies for military human-AI teams that integrate operational metrics, trust measures, ethical compliance, and mission effectiveness into a unified assessment framework. This measurement gap has practical consequences for dynamic autonomy management: without valid and reliable metrics for assessing human-AI team performance across different autonomy configurations, it is difficult to determine which autonomy allocation strategies produce the best outcomes and under what conditions.

Dynamic Autonomy and Adaptive Control

The concept of dynamic autonomy—the adjustment of the degree of autonomous authority granted to a system based on changing conditions—represents the central focus of this dissertation and a critical frontier in human-automation interaction research. This section examines the theoretical foundations and empirical findings on dynamic and adaptive autonomy approaches, mixed-initiative systems, function allocation methods, the persistent challenges of automation's ironies, and the specific application of these concepts to military systems.

Concepts of Dynamic and Sliding Autonomy

The concept of adjustable autonomy was formally articulated by Dorais et al. (1999) in their seminal work on human-centered autonomous systems for space applications. Their framework recognized that fixed autonomy levels—whether high or low—are suboptimal for complex, dynamic environments, and proposed that autonomy should be adjustable along multiple dimensions in response to changing mission requirements, environmental conditions, and human operator state. Dorais et al.
identified three key challenges for adjustable autonomy systems: determining when to adjust the level of autonomy, determining which functions should be reallocated between human and machine, and ensuring safe transitions between autonomy levels. These challenges remain at the core of dynamic autonomy management research more than two decades later.

Goodrich et al. (2001) conducted early experimental work on adjustable autonomy, demonstrating that operator performance and satisfaction varied significantly across different autonomy configurations and that the optimal autonomy level depended on the specific characteristics of the task and environment. Their experiments revealed that operators generally preferred intermediate autonomy levels that balanced the workload reduction benefits of automation against the situation awareness and control benefits of manual operation. Crandall and Goodrich (2002) extended this work by developing methods for characterizing the efficiency of human-robot interaction under different autonomy regimes, providing quantitative tools for comparing the performance of different autonomy allocation strategies.

The concept of sliding autonomy, introduced by Sellner et al. (2006), extended the adjustable autonomy framework to large-scale multiagent systems, addressing the coordination challenges that arise when autonomy levels must be managed across multiple autonomous agents simultaneously. Their work demonstrated that sliding autonomy—the continuous, gradual adjustment of autonomy levels rather than discrete switches between fixed levels—produced superior performance in complex assembly tasks by enabling the system to adapt smoothly to changing conditions. Scerri et al.
(2002) addressed the specific challenge of adjustable autonomy in real-world multiagent systems, developing computational frameworks for determining when and how to transfer control between human operators and autonomous agents based on the current state of the task, the environment, and the agents' estimated competence.

Feigh et al. (2012) provided a comprehensive framework for characterizing adaptive systems that adjust their behavior in response to changes in the operational context or the human operator's state. Their taxonomy of adaptive automation approaches distinguished among several dimensions of adaptation, including what aspects of the system are adapted, what triggers the adaptation, who initiates the adaptation, and how the adaptation is implemented. This framework is directly applicable to dynamic autonomy management in military contexts, where the decision about who initiates autonomy level changes—the human operator, the autonomous system, or a negotiated process—has significant implications for both operational effectiveness and meaningful human control.

Mixed-Initiative Systems and Playbook Approaches

Mixed-initiative interaction, in which both human operators and autonomous systems can initiate changes in task allocation and control authority, represents a sophisticated approach to dynamic autonomy management that avoids the limitations of purely human-initiated or purely system-initiated autonomy adjustment. Miller and Parasuraman (2007) proposed the concept of delegation interfaces for supervisory control, in which human operators interact with autonomous systems through interfaces designed to support flexible delegation of tasks rather than moment-by-moment control of system behavior.
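The delegation idea attributed to Miller and Parasuraman above—operators stating intent, constraints, and priorities rather than issuing detailed behavioral commands—can be sketched as a minimal tasking object. Everything below (class names, fields, the stub planner) is our own hypothetical illustration, not their published design:

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical sketch of delegation-style tasking: the operator specifies
# intent and hard constraints; the autonomous system expands the detail.
# Names and structure are illustrative assumptions only.
@dataclass
class Delegation:
    intent: str                   # what the operator wants achieved
    constraints: Tuple[str, ...]  # hard limits the system must honor
    priority: int = 1

def expand_to_plan(tasking: Delegation) -> List[str]:
    """Stub planner: fills in behavioral detail within the operator's
    stated intent and constraints (real planning logic omitted)."""
    plan = [f"objective: {tasking.intent}"]
    plan += [f"hard constraint: {c}" for c in tasking.constraints]
    return plan

tasking = Delegation(intent="surveil sector alpha",
                     constraints=("no weapons release",
                                  "remain outside threat ring"))
assert expand_to_plan(tasking)[0] == "objective: surveil sector alpha"
```

The design point is the level of abstraction: the operator's input is a goal plus limits, and the system retains initiative only inside that envelope.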
Their approach was grounded in the insight that effective supervisory control of autonomous systems requires interfaces that enable operators to communicate their intentions, constraints, and priorities to the autonomous system at an appropriate level of abstraction, rather than specifying detailed behavioral commands.

Shively et al. (2018) advanced the concept of playbook-based approaches to human-autonomy teaming, in which the human operator selects from a predefined set of plays that specify the allocation of functions between human and autonomous agents for a given tactical situation. The playbook metaphor, drawn from team sports, captures the idea that effective human-autonomy teaming requires a shared repertoire of coordinated action patterns that can be invoked rapidly in response to changing conditions. This approach offers significant advantages for military applications, where the speed of tactical decision-making may not allow for deliberative negotiation of autonomy levels between human operators and autonomous systems. By pre-specifying autonomy allocation patterns for anticipated tactical situations, the playbook approach enables rapid reconfiguration of human-machine roles without the cognitive overhead of real-time autonomy negotiation.

The mixed-initiative paradigm introduces specific challenges related to the management of initiative—the question of when the autonomous system should proactively adjust its behavior or request changes in autonomy allocation, and when it should defer to the human operator's current allocation decisions. Scerri et al.
(2002) addressed this challenge in the context of multiagent systems, developing algorithms that balance the benefits of autonomous initiative (faster response to changing conditions, reduced demands on human attention) against the costs (potential disruption of the human's current plan, reduced predictability, and the possibility of autonomy allocation decisions that the human would not endorse). The balance between autonomous initiative and human directability is particularly consequential in weapons employment contexts, where autonomous system initiative in adjusting engagement parameters could have lethal consequences if not properly constrained.

Function Allocation Methods

The allocation of functions between human operators and automated systems has been a central concern of human factors engineering since the dawn of automation. Fitts (1951) proposed the original "MABA-MABA" (Men Are Better At/Machines Are Better At) list, which allocated functions between humans and machines based on their respective strengths: humans were judged superior at perception, learning, inductive reasoning, and creative problem-solving, while machines excelled at speed, computation, repetitive tasks, and force application. While the Fitts list has been enormously influential, it has also been extensively criticized for its static, binary approach to function allocation, which fails to account for the context-dependent nature of human and machine performance and the possibility of dynamic reallocation (Feigh et al., 2012).

Modern approaches to function allocation have moved beyond the Fitts list toward more dynamic, context-sensitive frameworks. Parasuraman et al.
(2000) argued that function allocation should be considered not as a binary assignment of entire functions to either humans or machines but as a continuous allocation of each information-processing stage to the appropriate level of automation, with the optimal allocation varying based on task demands, environmental conditions, and human operator state. Crandall and Goodrich (2002) developed quantitative methods for evaluating the efficiency of different function allocation strategies in human-robot interaction, providing tools for empirically comparing alternative allocation approaches. Feigh et al. (2012) proposed a comprehensive characterization framework for adaptive systems that allocate functions dynamically, identifying the key design parameters that determine the effectiveness of adaptive function allocation.

The challenge of function allocation is particularly acute for autonomous weapons systems, where the allocation of engagement-critical functions has direct implications for compliance with international humanitarian law. The Parasuraman et al. (2000) framework is particularly useful in this context because it enables granular analysis of which specific functions in the engagement decision cycle—target detection, identification, classification, selection, and engagement authorization—should be allocated to human or machine at each level of automation. This function-specific approach to autonomy allocation provides the foundation for dynamic autonomy management frameworks that can adjust the human-machine allocation for each critical function independently, enabling configurations that, for example, grant high automation to target detection while maintaining full human authority for engagement authorization.
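The function-specific allocation described above can be made concrete with a brief sketch. The following Python fragment is purely illustrative and is not drawn from the dissertation or from Parasuraman et al. (2000); the class name, the 1–10 level scale, and the threshold value are assumptions introduced for exposition only.

```python
# Illustrative sketch only: per-function autonomy allocation across the
# engagement decision cycle, in the spirit of stage-wise levels of
# automation. The names and the 1-10 scale are hypothetical.
from dataclasses import dataclass

FULLY_MANUAL, FULLY_AUTONOMOUS = 1, 10  # endpoints of the assumed scale


@dataclass
class AllocationProfile:
    """Automation level (1-10) assigned to each engagement-cycle function."""
    detection: int
    identification: int
    classification: int
    selection: int
    engagement_authorization: int

    def satisfies_human_authority(self, max_auth_level: int = 5) -> bool:
        # A simple policy check: engagement authorization must stay at or
        # below a threshold that preserves human veto or consent.
        return self.engagement_authorization <= max_auth_level


# The configuration discussed in the text: high automation for detection,
# full human authority for engagement authorization.
profile = AllocationProfile(
    detection=9, identification=8, classification=7,
    selection=4, engagement_authorization=1,
)
print(profile.satisfies_human_authority())  # True
```

The point of the sketch is that each function carries its own level, so a dynamic autonomy manager can raise detection automation without ever touching the engagement-authorization setting.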
Ironies of Automation and Out-of-the-Loop Problems

Bainbridge's (1983) classic paper on the ironies of automation identified a set of paradoxes that continue to challenge automation design more than four decades later. The central irony is that the designers of automated systems, by automating the easy tasks and leaving the difficult tasks to human operators, create a situation in which the humans responsible for supervising the automation are those least prepared by their routine experience to handle the exceptional situations that the automation was not designed to manage. A second irony arises from the fact that the more reliable the automation becomes, the less practice human operators have in performing the automated functions manually, so that when the automation does fail, the operators' manual skills have atrophied precisely when they are most needed.

Endsley and Kiris (1995) provided empirical evidence for the out-of-the-loop performance problem, demonstrating that operators of automated systems experienced significant degradation in their ability to detect system failures and to take over manual control when automation failed. Their research showed that higher levels of automation were associated with greater degradation of operator performance when manual intervention was required, creating a dangerous paradox for system designers: the more authority granted to the automation, the less capable the human operator becomes at exercising the oversight function that justifies granting that authority. Parasuraman and Manzey (2010) documented the related phenomenon of automation complacency, providing a comprehensive theoretical account of the attentional mechanisms through which high-reliability automation induces reduced monitoring and cross-checking by human operators. The implications of automation ironies for autonomous weapons systems are particularly stark.
If the automation ironies apply to autonomous weapons—and there is no reason to believe they would not—then increasing the level of autonomy in weapons systems will progressively degrade the ability of human operators to maintain the situation awareness and intervention capability required for meaningful human control. This creates a fundamental tension in autonomous weapons design: the very capability that autonomous systems provide (rapid, high-volume decision-making beyond human cognitive capacity) simultaneously undermines the human oversight that legal and ethical frameworks demand. Dynamic autonomy management offers a potential resolution to this tension by maintaining human operators in an active decision-making role for at least some functions at all times, thereby preventing the complete disengagement that produces the most severe out-of-the-loop performance problems.

Kaber and Endsley (2004) demonstrated that adaptive automation—systems that automatically adjust the level of automation based on task demands and operator state—can mitigate the out-of-the-loop performance problem by keeping operators engaged in the decision process even during periods of high automation. Their experimental findings showed that adaptive automation produced better situation awareness and faster takeover performance than static high-level automation, while still providing workload reduction during low-demand periods. These results provide direct empirical support for dynamic autonomy management approaches that adjust autonomy levels in real time rather than fixing them at a single level, suggesting that the ironies of automation can be at least partially addressed through thoughtful autonomy management design.
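An adaptive-automation rule of the kind Kaber and Endsley (2004) studied can be sketched as a simple adjustment function. The sketch below is hypothetical: the thresholds, the 1–10 scale, and the ceiling of 9 are the author's illustrative assumptions, not parameters from their experiments.

```python
# Hypothetical sketch of an adaptive-automation rule: raise the autonomy
# level under high task demand, lower it when demand subsides so the
# operator stays engaged. All numeric values are illustrative.
def adapt_autonomy_level(current_level: int,
                         task_demand: float,
                         operator_workload: float) -> int:
    """Return a new autonomy level on a 1-10 scale.

    task_demand and operator_workload are normalized to [0, 1].
    """
    if task_demand > 0.8 or operator_workload > 0.8:
        # Offload during overload, but cap below full autonomy (10) so
        # the operator is never pushed entirely out of the loop.
        return min(current_level + 1, 9)
    if task_demand < 0.3 and operator_workload < 0.3:
        # Hand functions back during quiet periods to preserve manual skill.
        return max(current_level - 1, 1)
    return current_level  # stable conditions: no reallocation


# Example: overload raises the level; a quiet period lowers it again.
level = adapt_autonomy_level(6, task_demand=0.9, operator_workload=0.7)   # -> 7
level = adapt_autonomy_level(level, task_demand=0.2, operator_workload=0.1)  # -> 6
```

The cap below full autonomy encodes the chapter's central point: adaptive reallocation only mitigates the out-of-the-loop problem if some decision role is always reserved for the human.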
Context-Dependent Autonomy Allocation in Military Systems

The application of dynamic autonomy concepts to military systems introduces context-specific requirements that distinguish military autonomy allocation from its civilian counterparts. Military operations are characterized by contested and degraded communication environments in which continuous human oversight may be technically infeasible, adversarial threats that may exploit predictable autonomy allocation patterns, rapidly changing rules of engagement that may require immediate reconfiguration of human-machine authority allocation, and the ultimate possibility that autonomous system actions may result in the loss of human life. These characteristics create unique design requirements for dynamic autonomy management frameworks that civilian applications do not fully share.

The U.S. Air Force's Autonomous Horizons report (U.S. Air Force, 2015) articulated a vision for increasing autonomous capability in Air Force systems, recognizing that the path toward greater autonomy would require new approaches to managing the human-machine relationship. The report emphasized that autonomous capability should be viewed not as a replacement for human decision-making but as an expansion of the decision space available to human commanders, enabling them to operate more effectively across a wider range of conditions. The Hague Centre for Strategic Studies (2022) provided a complementary analysis of the design, development, and employment of robotic and autonomous systems in military operations, identifying the operational factors that should drive autonomy allocation decisions, including mission criticality, threat level, communication reliability, and the available time for human decision-making.

The concept of context-dependent autonomy allocation is directly reflected in current U.S. military doctrine and policy. DoD Directive 3000.09 (U.S.
Department of Defense, 2023) implicitly endorses a context-dependent approach by specifying different approval authorities and oversight requirements for autonomous weapons employment based on the nature of the targets, the operational environment, and the type of autonomous functions involved. Similarly, the NATO principles of responsible use for artificial intelligence in defence (NATO, 2024) emphasize that the appropriate level of human oversight for AI systems should be determined based on the specific context of employment, including the potential consequences of AI system actions and the availability of human decision-makers. These policy frameworks provide the institutional context within which dynamic autonomy management frameworks for military applications must operate and the requirements they must satisfy.

Command and Control in the Age of AI

The integration of artificial intelligence into military command and control represents a transformation of the fundamental processes through which military forces are directed and coordinated. This section examines the theoretical foundations of C2, the evolution from network-centric warfare to AI-enabled C2 concepts, the relationship between mission command philosophy and AI integration, the emerging Joint All-Domain Command and Control architecture, and the specific challenges and opportunities of AI decision support in military C2 environments.

Classical C2 Theory

John Boyd's OODA loop—Observe, Orient, Decide, Act—has served as the dominant conceptual framework for understanding military command and control since its introduction in the 1970s. Boyd (1996) argued that competitive advantage in conflict accrues to the party that can complete the OODA cycle faster and more accurately than the adversary, disrupting the adversary's decision cycle while maintaining the coherence of one's own.
The OODA loop framework has been enormously influential in military thinking about C2, but it has also been criticized for implying a linear, sequential decision process that may not accurately represent the parallel, iterative nature of actual military decision-making, particularly in complex multi-domain operations. Brehmer (2005) proposed a dynamic OODA loop that amalgamated Boyd's framework with cybernetic approaches to command and control, addressing the limitations of the original linear model. Brehmer's dynamic model recognized that the OODA loop operates as a continuous feedback process rather than a discrete sequential cycle, with observations continuously informing orientation, orientation shaping the interpretation of new observations, and actions generating new observations that initiate subsequent cycles. This dynamic perspective is particularly relevant to the integration of AI into C2, as AI systems can accelerate and augment specific phases of the OODA loop—particularly the Observe and Orient phases, through enhanced sensor fusion and pattern recognition—while potentially creating new challenges for the Decide phase if the increased speed of information processing outpaces human cognitive capacity for deliberative decision-making.

Alberts and Hayes (2003, 2006) made foundational contributions to C2 theory through their work on power to the edge and the understanding of command and control. Their analysis argued that traditional hierarchical C2 structures, while appropriate for industrial-age warfare, were increasingly inadequate for the complexity, speed, and distributed nature of information-age conflict. They proposed that effective C2 in the information age required distributing decision-making authority to the edge of the organization—to the personnel closest to the point of action—while maintaining the information connectivity necessary for coordinated action.
This edge-based C2 philosophy anticipated many of the opportunities and challenges that AI integration into C2 now presents, as AI can serve as the connective tissue that enables distributed decision-making while maintaining the shared awareness necessary for organizational coherence.

Network-Centric Warfare and C2 Agility

The network-centric warfare concept, articulated by Cebrowski and Garstka (1998), proposed that networking military forces and platforms through robust information networks would generate increased combat power through improved shared awareness, increased speed of command, higher tempo of operations, greater lethality, increased survivability, and a degree of self-synchronization. The concept rested on three tenets: that a robustly networked force improves information sharing, that information sharing enhances the quality of information and shared situational awareness, and that shared situational awareness enables collaboration and self-synchronization that improve sustainability and speed of command. While network-centric warfare was developed before the current generation of AI capabilities, its emphasis on information superiority and shared awareness as the foundations of military advantage provides the intellectual context for contemporary efforts to integrate AI into C2. The NATO C2 Agility concept, developed through the SAS-085 study (NATO STO, 2014), extended network-centric warfare thinking by recognizing that effective C2 requires not only information superiority but also the organizational agility to reconfigure C2 structures in response to changing operational requirements.
The NATO NEC C2 Maturity Model identified a spectrum of C2 approaches ranging from conflicted (uncoordinated independent action) through de-conflicted, coordinated, and collaborative to edge (fully distributed self-synchronization), and argued that effective C2 requires the ability to move among these approaches as the situation demands. This C2 agility concept maps directly onto the requirements for dynamic autonomy management, as the appropriate allocation of decision authority between human commanders and autonomous systems may need to shift along a comparable spectrum in response to changing operational conditions.

The transition from network-centric warfare to AI-enabled C2 represents a qualitative shift in the nature of the information advantage that C2 seeks to exploit. Where network-centric warfare focused on connecting human decision-makers to enable faster and better-informed decisions, AI-enabled C2 introduces artificial agents that can process information, generate courses of action, and in some cases execute decisions at speeds and scales beyond human cognitive capacity. The Special Competitive Studies Project (2024) analyzed the implications of this transition for military C2, concluding that reimagining C2 in the age of AI requires fundamental changes to organizational structures, decision processes, and human-machine relationships. The Joint Air Power Competence Centre (2021) reached similar conclusions in its assessment of AI's potential impact on C2 systems, emphasizing that AI integration would require new doctrinal frameworks for authority delegation and new approaches to human-machine trust and oversight.

Mission Command Philosophy and AI Integration

The U.S. Army's mission command philosophy, codified in ADP 6-0 (U.S. Army, 2019), emphasizes the exercise of authority and direction by commanders using mission-type orders to enable disciplined initiative within the commander's intent.
Mission command is predicated on the principle that the complexity and uncertainty of military operations require empowering subordinate commanders with sufficient authority and information to make decisions and adapt to changing conditions without waiting for detailed instructions from higher headquarters. The philosophy rests on seven principles: competence, mutual trust, shared understanding, commander's intent, mission orders, disciplined initiative, and risk acceptance.

The integration of AI into C2 creates both opportunities and tensions with mission command philosophy. On one hand, AI capabilities can enhance each of the seven principles: AI can augment commander competence through decision support, facilitate shared understanding through enhanced common operating pictures, clarify commander's intent through natural language processing and intent extraction, and enable more effective risk assessment through computational modeling. On the other hand, AI integration may also create pressure toward centralized control, as the availability of AI-processed information at higher echelons could tempt senior commanders to micromanage subordinate operations—undermining the decentralized execution that mission command requires.

Lingel et al. (2020) analyzed the implications of AI integration for joint all-domain command and control, identifying the analytical frameworks necessary for evaluating AI applications in military C2. Their analysis highlighted the tension between AI's potential to accelerate C2 processes and the risk that increased speed without corresponding improvements in human understanding could lead to faster but poorer decisions. Hoehn and Sayler (2022) provided a comprehensive analysis of AI and national security implications, documenting how AI integration into C2 was reshaping the relationship between technological capability and military strategy.
The Center for Strategic and International Studies (2023) examined the state of DoD AI and autonomy policy, finding that while the institutional commitment to AI integration was strong, the policy frameworks governing AI's role in C2 decision-making remained underdeveloped.

Joint All-Domain Command and Control (JADC2)

Joint All-Domain Command and Control (JADC2) represents the U.S. Department of Defense's overarching concept for integrating command and control capabilities across all military domains—land, air, sea, space, and cyberspace—and all military services into a unified C2 architecture enabled by AI and advanced networking. Hoehn (2022) provided a comprehensive analysis of the JADC2 concept, documenting its architecture, development status, and implementation challenges. The JADC2 vision posits that AI-enabled processing of multi-domain sensor data, combined with machine-speed analysis and recommendation generation, will enable military commanders to make decisions faster and more effectively than adversaries, achieving the information and decision advantage that Boyd's OODA loop framework identifies as the key to competitive success.

The JADC2 architecture envisions AI systems performing a range of C2 functions, including sensor fusion across multiple domains, automated target identification and tracking, course of action generation and evaluation, resource allocation optimization, and battle damage assessment. The scope and ambition of these functions make JADC2 the most comprehensive effort to date to integrate AI into the full spectrum of military C2 activities. However, as Lingel et al. (2020) observed, the JADC2 concept raises profound questions about the appropriate role of human decision-makers in an AI-enabled C2 architecture. If AI systems can process information and generate recommendations faster and more comprehensively than human staff officers, what is the human's role in the C2 process?
How should decision authority be allocated between human commanders and AI systems across different decision types and operational conditions? These questions are directly relevant to the present dissertation's focus on dynamic autonomy management, as JADC2 provides the operational context within which autonomous weapons systems will be employed and controlled. The dynamic allocation of decision authority between human commanders and autonomous weapons systems cannot be considered in isolation from the broader JADC2 architecture within which those decisions are made. The National Security Commission on Artificial Intelligence (2021) recognized this connection in its final report, recommending that the Department of Defense develop frameworks for AI-enabled decision-making that preserve human judgment for the most consequential decisions while leveraging AI speed and analytical capacity for decisions where speed is essential and the consequences are within acceptable bounds.

AI Decision Support in Military C2

The application of AI to military decision support has demonstrated both significant potential and persistent challenges. Strouse et al. (2024) developed scalable interactive machine learning approaches for future command and control, demonstrating that machine learning systems could be trained through interaction with military decision-makers to provide increasingly relevant and useful decision support over time. Their work highlighted the importance of designing AI decision support systems that learn from and adapt to the specific decision-making styles and preferences of individual commanders, rather than imposing a standardized analytical framework. Cummings (2017) provided an influential analysis of AI and the future of warfare, examining the cognitive implications of AI decision support for military commanders.
Her analysis identified several critical challenges, including the risk that AI decision support would create cognitive tunneling—where commanders focus on AI-recommended courses of action at the expense of creative alternatives—and the difficulty of maintaining appropriate skepticism toward AI recommendations when the AI system has a track record of superior analytical performance. These concerns are consistent with the broader automation bias literature (Parasuraman & Manzey, 2010) and underscore the importance of designing AI decision support systems that augment rather than replace human analytical processes.

The speed-accuracy tradeoff in AI-augmented military decision-making represents a particularly consequential challenge. Burdette et al. (2026) analyzed how AI could reshape essential competitions in future warfare, finding that AI-enabled decision-making offered significant advantages in speed but that increased speed could come at the cost of decision quality if human oversight was inadequate. Morgan et al. (2020) examined the ethical dimensions of this tradeoff in their analysis of military applications of AI, arguing that the pressure to match adversary decision speed could drive military organizations toward levels of automation that compromise meaningful human control and ethical accountability. The Government Accountability Office (2022) identified this tension in its assessment of DoD AI strategies, recommending that the Department develop clearer guidance on the appropriate balance between AI speed and human oversight in different decision contexts.

Meaningful Human Control and Governance

The concept of meaningful human control has emerged as the dominant framework for governing the employment of autonomous weapons systems, providing a standard against which the adequacy of human oversight can be assessed without requiring a return to fully manual control.
This section examines the theoretical development of the meaningful human control concept, the U.S. policy framework governing autonomous weapons, international governance efforts, accountability and responsibility frameworks, and the technical mechanisms through which human control can be implemented and maintained.

The Concept of Meaningful Human Control

Santoni de Sio and van den Hoven (2018) provided the foundational philosophical account of meaningful human control over autonomous systems. Their framework defined meaningful human control as requiring two conditions: a tracking condition, which demands that the system's behavior track the reasons and intentions of the human controller, and a tracing condition, which demands that the system's behavior be attributable to the human controller in a way that supports moral responsibility. This dual-condition framework provided the first rigorous philosophical specification of what meaningful human control requires, moving the discourse beyond vague appeals to "human in the loop" toward a substantive account of the relationship between human agency and machine action.

Ekelhof (2019) advanced the operationalization of meaningful human control by arguing for a shift from semantic debates about definitions toward practical analysis of how meaningful human control can be maintained in actual military operations. Her framework emphasized the operational context of control, arguing that what constitutes meaningful control depends on the specific circumstances of employment, including the nature of the target, the threat to friendly forces, the availability of communication, and the time available for human decision-making.
Mecacci and Santoni de Sio (2020) extended the meaningful human control framework to dual-mode systems that can operate in both autonomous and human-controlled modes, developing the concept of reason-responsiveness as a criterion for meaningful control that applies regardless of which mode is active. Cavalcante Siebert et al. (2023) made a significant contribution by identifying actionable design properties for AI systems that support meaningful human control. Their analysis translated the abstract philosophical concept into concrete engineering requirements, including properties related to system transparency, human intervention capability, decision traceability, and value alignment. Veluwenkamp (2023) provided a complementary philosophical analysis of what autonomous systems should track to maintain meaningful human control, developing the concept of reasons-responsiveness as a more precise specification of the tracking condition that Santoni de Sio and van den Hoven (2018) identified. Horowitz and Scharre (2015) provided a practitioner-oriented primer on meaningful human control in weapon systems, bridging the gap between philosophical analysis and defense policy by articulating practical criteria for assessing whether a given weapon system affords meaningful human control.

Roff and Moyes (2016) examined meaningful human control in relation to artificial intelligence and autonomous weapons, arguing that the increasing sophistication of AI decision-making in weapons systems required correspondingly more sophisticated frameworks for ensuring that human control remained meaningful rather than nominal. Their analysis distinguished between formal control (the technical ability to intervene) and effective control (the practical ability to exercise informed judgment), arguing that meaningful human control requires both.
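The design properties surveyed above can be expressed as a simple conjunctive checklist, which also captures the formal-versus-effective control distinction: satisfying one property (intervention capability) while failing another (traceability) does not suffice. The sketch below is the author's illustration; the property names paraphrase the literature and are not an official standard or taxonomy.

```python
# Illustrative checklist only: design properties associated with
# meaningful human control, treated conjunctively. Property names are a
# paraphrase of the literature, not a normative standard.
MHC_DESIGN_PROPERTIES = (
    "system_transparency",
    "human_intervention_capability",
    "decision_traceability",
    "value_alignment",
)


def meets_meaningful_human_control(assessment: dict) -> bool:
    """All properties must hold; any missing or failed property fails the check."""
    return all(assessment.get(prop, False) for prop in MHC_DESIGN_PROPERTIES)


# Example: a formal override exists, but decisions are not traceable,
# so the system fails the conjunctive test.
assessment = {
    "system_transparency": True,
    "human_intervention_capability": True,
    "decision_traceability": False,
    "value_alignment": True,
}
print(meets_meaningful_human_control(assessment))  # False
```

Treating the properties conjunctively rather than as a weighted score reflects the argument that nominal compliance with a subset of conditions does not amount to meaningful control.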
Amoroso and Tamburrini (2020) examined the ethical and legal dimensions of meaningful human control, arguing that the concept must encompass not only the technical capacity for human intervention but also the cognitive conditions—including situation awareness, understanding of system behavior, and adequate time for deliberation—that enable informed human judgment.

DoD Directive 3000.09 and U.S. Policy Framework

DoD Directive 3000.09, Autonomy in Weapon Systems, constitutes the primary U.S. policy instrument governing the development and employment of autonomous weapons systems. Originally issued in 2012 and updated in January 2023, the directive establishes the policy framework within which all U.S. autonomous weapons development, testing, and employment must occur (U.S. Department of Defense, 2023). The directive distinguishes between autonomous and semi-autonomous weapons systems and establishes different approval authorities and review processes for each category. For autonomous weapons systems—defined as systems that, once activated, can select and engage targets without further human input—the directive requires senior-level review and approval through the Autonomous Weapons Systems Senior Review Group.

The 2023 update to Directive 3000.09 reflected several significant developments in both technology and policy thinking since the original 2012 issuance. The update refined the definitions of autonomous and semi-autonomous weapons systems, updated the review and approval processes, and incorporated language reflecting the Department's 2019 AI principles and subsequent policy developments. The Congressional Research Service (2020) provided a comprehensive analysis of U.S.
policy on lethal autonomous weapons systems, noting that the directive's approach of requiring human oversight while not categorically prohibiting autonomous weapons represents a middle ground between advocates of autonomous weapons and their opponents. The Defense Innovation Board's (2019) AI principles—responsible, equitable, traceable, reliable, and governable—provided additional guidance that complemented the directive's regulatory framework.

The broader U.S. policy landscape for military AI and autonomy includes the 2018 Department of Defense AI Strategy (U.S. Department of Defense, 2019), which established the Department's vision for leveraging AI to maintain military advantage while ensuring that AI systems operate in a manner consistent with American values and legal obligations. The establishment of the Chief Digital and Artificial Intelligence Officer (U.S. Department of Defense, 2022) consolidated AI governance authority and signaled the Department's intent to accelerate AI adoption while maintaining centralized oversight of autonomous system development. The National Security Commission on Artificial Intelligence (2021) recommended a comprehensive approach to AI governance that included strengthening Directive 3000.09's provisions, developing more detailed guidelines for autonomous weapons testing and evaluation, and establishing mechanisms for regular review and update of autonomous weapons policies as technology evolves.

International Governance Efforts

International efforts to govern autonomous weapons have centered on the United Nations Convention on Certain Conventional Weapons (CCW), which has served as the primary multilateral forum for discussions on lethal autonomous weapons systems since 2013.
The CCW Group of Governmental Experts on Lethal Autonomous Weapons Systems has convened annually to discuss the legal, ethical, and technical dimensions of autonomous weapons, but has failed to achieve consensus on binding regulations despite years of deliberation (Congressional Research Service, 2019). The UNIDIR (2025) analysis of the interpretation and application of international humanitarian law to lethal autonomous weapons systems provided the most comprehensive recent assessment of the legal challenges, documenting significant disagreements among states regarding the applicability and interpretation of existing IHL provisions to autonomous weapons.

State positions on autonomous weapons governance span a wide spectrum. Some states, including Austria, Brazil, and Chile, have called for a preemptive ban on fully autonomous weapons, while others, including the United States, United Kingdom, and Russia, have argued that existing international humanitarian law provides an adequate framework for governing autonomous weapons and that specific new regulations are premature given the evolving state of technology. The SIPRI analysis of limits on autonomy in weapon systems (Boulanin et al., 2020) examined the practical elements of human control that could serve as the basis for international governance, identifying specific technical and operational requirements that could be translated into regulatory standards. The SIPRI (2023) report on international humanitarian law and autonomous weapons systems identified the key legal questions that governance frameworks must address, including the nature and degree of human-machine interaction required to satisfy IHL obligations. NATO has developed its own governance frameworks for military AI, including the NATO principles of responsible use for artificial intelligence in defence (NATO, 2024) and the revised NATO AI strategy (NATO, 2024).
These frameworks emphasize six principles—lawfulness, responsibility, explainability, reliability, governability, and bias mitigation—that must be satisfied by AI systems employed in defense applications. While these principles do not directly address autonomous weapons governance, they provide the normative framework within which NATO allies are developing their approaches to autonomous weapons employment and oversight. The convergence of these principles with the requirements identified in the academic literature on meaningful human control suggests an emerging consensus on the conditions that must be satisfied for responsible autonomous weapons employment, even as significant disagreements persist about how those conditions should be implemented and enforced.

Accountability and Responsibility Frameworks

The question of accountability for actions taken by autonomous weapons systems—the so-called "responsibility gap"—represents one of the most challenging theoretical and practical problems in autonomous weapons governance. Matthias (2004) provided the foundational analysis of this gap, arguing that machine learning systems that modify their own behavior based on experience create a situation in which no human agent possesses sufficient knowledge of or control over the system's decision processes to be held fully responsible for its actions. This responsibility gap is particularly acute for autonomous weapons systems, where the actions in question may constitute violations of international humanitarian law resulting in civilian casualties or other serious harms.

Sparrow (2007) extended Matthias's analysis to the specific context of lethal autonomous weapons, examining the potential loci of responsibility—the programmer, the commanding officer, and the machine itself—and arguing that none can bear full moral responsibility for the actions of a truly autonomous weapon.
His analysis concluded that this inability to assign responsibility constitutes a sufficient ethical reason to prohibit fully autonomous weapons. Champagne and Tonkens (2015) offered a contrasting perspective, proposing a framework for bridging the responsibility gap in automated warfare that distributes responsibility across multiple actors in the weapons employment chain according to their respective contributions to and knowledge of the system's actions. Crootof (2015) examined the legal implications of autonomous weapons from a domestic law perspective, analyzing how existing legal frameworks for weapons employment could be adapted or extended to address the unique accountability challenges posed by autonomous systems. Bode and Huelss (2018) examined how autonomous weapons systems interact with and potentially transform international norms governing the use of force, arguing that the deployment of increasingly autonomous weapons is gradually shifting the normative landscape in ways that existing governance frameworks may not adequately capture. The Stockholm International Peace Research Institute (2022) analyzed the specific challenge of retaining human responsibility in the development and use of autonomous weapons systems, identifying institutional, procedural, and technical mechanisms through which responsibility can be maintained even as the level of machine autonomy increases.

Technical Implementation of Human Control Mechanisms

The technical implementation of human control mechanisms for autonomous weapons systems must translate the abstract requirements of meaningful human control into concrete system design features. Article 36 (2013) identified specific technical requirements for maintaining human control over autonomous weapons, including transparent and predictable system behavior, reliable override and abort capabilities, and adequate information presentation to support informed human decision-making.
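The Article 36 requirements of reliable override and abort capability, predictable behavior, and adequate information presentation can be illustrated with a minimal sketch. The gate names and state fields below are hypothetical simplifications for exposition, not elements of any fielded system:

```python
from dataclasses import dataclass

@dataclass
class ControlState:
    """Human-control status for a supervised autonomous system (illustrative)."""
    operator_ack: bool       # operator has reviewed the system's current intent
    abort_requested: bool    # operator has issued an abort
    inside_boundary: bool    # system remains within its authorized geographic area

def action_permitted(state: ControlState) -> bool:
    """An autonomous action may proceed only when every human-control gate holds.

    The abort request is checked first and unconditionally, modeling Article 36's
    requirement for a reliable override; the boundary and acknowledgment gates
    stand in for predictable behavior and adequate information presentation.
    """
    if state.abort_requested:
        return False
    return state.inside_boundary and state.operator_ack
```

The point of the sketch is structural: because the abort path dominates all other conditions, no combination of remaining gates can defeat it, which is the minimal property a reliable override mechanism must guarantee.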
Bode and Watts (2023) examined lessons from air defense systems for lethal autonomous weapons governance, documenting how existing air defense systems implement human control mechanisms and identifying the limitations and failures of these mechanisms that inform requirements for next-generation autonomous weapons. The Stop Killer Robots campaign (2022) documented ten examples of increasing autonomy in weapons systems, analyzing the human control mechanisms implemented in each and identifying gaps between the control mechanisms provided and the requirements of meaningful human control. Their analysis revealed that many existing weapons systems with autonomous capabilities implement only minimal human control mechanisms—typically an on/off switch and a geographic boundary—that may satisfy the formal requirement for human involvement without providing the substantive oversight that meaningful human control demands. Human Rights Watch and International Human Rights Clinic (2020) proposed specific elements for a treaty on autonomous weapons, including technical requirements for human control mechanisms that would ensure meaningful rather than nominal human involvement in weapons employment decisions. The DARPA Assured Autonomy program (DARPA, 2019) addressed the verification and validation challenges that underpin technical implementation of human control. The program developed methods for providing formal guarantees of autonomous system behavior within specified parameters, contributing to the technical foundation for human control mechanisms that can provide commanders with warranted confidence in autonomous system behavior. 
The program's results demonstrated that while full formal verification of complex AI systems remains beyond current capabilities, meaningful bounds on autonomous system behavior can be established for specific operational contexts—a finding that supports the feasibility of context-dependent dynamic autonomy management frameworks that adjust the level of autonomous authority based on the degree of confidence in the system's behavior.

Legal and Ethical Frameworks

The development and employment of autonomous weapons systems raise profound legal and ethical questions that directly shape the design requirements for dynamic autonomy management frameworks. This section examines the application of international humanitarian law to autonomous weapons, just war theory perspectives, ethical analyses from multiple philosophical traditions, the human dignity implications of algorithmic killing, and civil society perspectives on autonomous weapons governance.

International Humanitarian Law Applied to Autonomous Weapons Systems

International humanitarian law (IHL), also known as the law of armed conflict, establishes the legal framework governing the conduct of hostilities and the protection of persons affected by armed conflict. Three principles of IHL are particularly relevant to autonomous weapons systems: the principle of distinction, which requires combatants to distinguish between military objectives and civilians or civilian objects; the principle of proportionality, which prohibits attacks that are expected to cause incidental civilian harm excessive in relation to the concrete and direct military advantage anticipated; and the obligation to take precautions in attack, which requires commanders to take all feasible precautions to minimize civilian harm.
Dinstein (2016) provided a comprehensive analysis of the conduct of hostilities under IHL that serves as the authoritative reference for applying these principles to modern weapons systems. Schmitt (2013) provided the most influential legal analysis of autonomous weapons systems under IHL, arguing that existing IHL provisions are sufficiently flexible to govern autonomous weapons without requiring new legal instruments. His analysis demonstrated that the legality of autonomous weapons under IHL depends not on the degree of autonomy per se but on whether the system, as employed in a given context, satisfies the requirements of distinction, proportionality, and precaution. This context-dependent approach to legal analysis aligns with the function-specific approach to autonomy classification and supports the case for dynamic autonomy management frameworks that adjust autonomy levels based on the specific legal requirements of the operational context. Boothby (2014) examined the influence of new weapons technology on conflict law, analyzing how emerging autonomous capabilities interact with existing legal obligations. His analysis identified specific challenges that autonomous weapons pose for each principle of IHL: for distinction, the challenge of programming systems to reliably distinguish between combatants and civilians in complex operational environments; for proportionality, the challenge of implementing the subjective judgment of expected military advantage that proportionality requires; and for precaution, the challenge of ensuring that autonomous systems take "all feasible precautions" when the definition of feasibility may differ between human and machine decision-makers.
Davison (2017) provided a complementary ICRC analysis of autonomous weapons under IHL, emphasizing the importance of human judgment in the application of IHL principles and questioning whether autonomous systems can replicate the contextual judgment that IHL compliance requires. Thurnher (2012) examined the legal implications of fully autonomous targeting, analyzing the specific legal challenges that arise when no human is directly involved in targeting decisions. His analysis concluded that while fully autonomous targeting is not categorically prohibited under IHL, it would require confidence that the autonomous system can comply with IHL requirements across the full range of anticipated employment conditions—a standard that current technology cannot reliably meet. The ICRC (2021) analysis of autonomous weapons under IHL represented the most authoritative institutional statement on the legal issues, identifying the critical legal questions that remain unresolved and the conditions under which autonomous weapons employment could satisfy IHL requirements. The UNIDIR (2025) report provided updated analysis incorporating recent technological developments and state practice, documenting the widening gap between technological capability and legal clarity.

Just War Theory and Autonomous Weapons

Just war theory provides the broader ethical framework within which the morality of autonomous weapons employment is evaluated. Leveringhaus (2016) provided the most comprehensive just war theory analysis of autonomous weapons to date, examining how the traditional jus ad bellum (right to war) and jus in bello (right conduct in war) categories apply to autonomous weapons systems.
His analysis identified several novel ethical challenges that autonomous weapons pose for just war theory, including the question of whether machines can exercise the moral agency that just war theory presupposes in combatants, whether the deployment of autonomous weapons satisfies the just war requirement of right intention, and whether the transfer of lethal decision-making to machines is compatible with the principle of last resort. The question of moral agency is particularly central to the just war theory analysis of autonomous weapons. Just war theory traditionally presupposes that combatants are moral agents capable of exercising moral judgment in the application of lethal force. If autonomous weapons systems lack moral agency—as virtually all philosophers and AI researchers agree they do in their current form—then the moral responsibility for their actions must be attributed to human agents in the chain of command. This attribution creates the moral analog of the legal responsibility gap identified by Matthias (2004), raising the question of whether it is morally permissible to employ weapons whose actions cannot be attributed to a specific moral agent's deliberative judgment. Strawser (2010) provided an influential counterargument, proposing a duty to employ unmanned and potentially autonomous weapons systems when doing so would reduce risk to military personnel without increasing risk to civilians. His argument, grounded in the just war principle of minimizing unnecessary harm, suggests that rejecting autonomous weapons on moral grounds may itself be morally problematic if those weapons could reduce overall harm. Robillard (2018) challenged the conceptual foundations of the "killer robot" debate, arguing that the concept of a killer robot as distinct from other weapons involves a category error that has distorted ethical analysis.
These contrasting perspectives illustrate the complexity and unsettled nature of the ethical debate over autonomous weapons, which remains far from resolution despite decades of sustained scholarly attention.

Ethical Perspectives on Autonomous Weapons Systems

The ethical debate over autonomous weapons draws on multiple philosophical traditions, each of which illuminates different dimensions of the moral landscape. Asaro (2012) argued from a human rights perspective that autonomous weapons threaten the dehumanization of lethal decision-making, reducing the act of killing from a deliberative moral judgment to an algorithmic output. His analysis emphasized the intrinsic moral significance of human deliberation in decisions to take human life, arguing that even if autonomous systems could achieve equivalent or superior outcomes in terms of IHL compliance, the absence of human moral deliberation from the lethal decision renders it morally deficient. Arkin (2009) provided the most prominent consequentialist counterargument, arguing that autonomous weapons systems could potentially be designed to perform more ethically than human soldiers, who are subject to emotions, fatigue, stress, and cognitive biases that can lead to violations of the laws of war. His proposed ethical governor—a software architecture designed to constrain autonomous system behavior within the boundaries of international humanitarian law and rules of engagement—represented an ambitious attempt to operationalize ethical constraints in autonomous weapons systems. While Arkin's proposal has been criticized on multiple grounds, including the difficulty of formalizing the contextual judgment that ethical decision-making requires, it remains the most fully developed technical approach to embedding ethical reasoning in autonomous weapons.
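At its core, Arkin's ethical-governor concept is a veto layer that evaluates each proposed action against an encoded constraint set before the action can proceed. The sketch below is a drastic simplification: the constraint names and action fields are hypothetical and stand in for the far richer ROE and IHL representations the full architecture envisions:

```python
def governor(proposed_action: dict, constraints: list) -> tuple:
    """Return (permitted, names_of_violated_constraints) for a proposed action.

    The action is suppressed if any constraint predicate fails; the list of
    violations supports the after-action traceability the ethical governor
    is intended to provide.
    """
    violations = [name for name, predicate in constraints
                  if not predicate(proposed_action)]
    return (not violations, violations)

# Toy constraints standing in for encoded ROE/IHL rules (hypothetical).
toy_constraints = [
    ("target_verified", lambda a: a.get("target_verified", False)),
    ("within_mission_area", lambda a: a.get("in_area", False)),
]

permitted, violated = governor({"target_verified": True, "in_area": False},
                               toy_constraints)
# permitted is False; violated == ["within_mission_area"]
```

The design choice worth noting is that the governor is purely restrictive: it can only suppress actions, never initiate them, which is what distinguishes a constraint layer of this kind from a decision-making component.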
Champagne and Tonkens (2015) proposed a framework for bridging the responsibility gap in automated warfare that draws on both deontological and consequentialist reasoning. Their approach distributes responsibility across multiple actors in the weapons employment chain—designers, programmers, commanders, operators, and institutional actors—according to their respective roles and contributions, arguing that while no single actor bears full responsibility, the aggregate of distributed responsibility is sufficient to satisfy moral accountability requirements. This distributed responsibility framework has significant implications for dynamic autonomy management, as the allocation of responsibility depends on the allocation of authority: when more decision authority is allocated to the autonomous system, the distribution of responsibility shifts accordingly. Horowitz (2016) provided a nuanced ethical assessment of autonomous weapons that situated the debate within the broader context of military ethics and the ethics of emerging technology. His analysis rejected both categorical opposition to and uncritical embrace of autonomous weapons, arguing instead for a contextual ethical approach that evaluates the morality of autonomous weapons employment based on the specific circumstances of use, including the nature of the target, the operational environment, and the available alternatives. This contextual approach to the ethics of autonomous weapons is consistent with the dynamic autonomy management framework proposed in this dissertation, which advocates for context-dependent allocation of decision authority rather than fixed autonomy levels.

Human Dignity and the Ethics of Algorithmic Killing

The concept of human dignity provides a philosophical foundation for objections to autonomous weapons that transcends the consequentialist framework of harm reduction.
Critics argue that subjecting human beings to lethal decisions made by algorithms, regardless of the accuracy of those decisions, violates the inherent dignity of the persons targeted. Asaro (2012) articulated this position most forcefully, arguing that the right to be judged by a human mind capable of empathy, mercy, and contextual moral reasoning is a fundamental aspect of human dignity that cannot be satisfied by algorithmic equivalents, however sophisticated. The International Committee of the Red Cross (2021) echoed this concern in its analysis, noting that the reduction of life-and-death decisions to algorithmic processes raises fundamental questions about the value and dignity of human life. The counterargument to dignity-based objections points to the imperfect reality of human decision-making in armed conflict. Arkin (2009) argued that the idealized image of the human soldier exercising careful moral deliberation before each use of lethal force bears little resemblance to the reality of combat, where decisions are frequently made under extreme stress, fear, anger, and cognitive overload. If the practical alternative to algorithmic decision-making is not ideal human judgment but imperfect human judgment degraded by the psychological realities of combat, then dignity-based objections must weigh the indignity of algorithmic targeting against the indignity of being targeted by a stressed, fatigued, or emotionally compromised human combatant. This debate has direct implications for dynamic autonomy management. If one accepts that human dignity requires some form of human involvement in lethal decisions but does not require exclusive human control, then the question becomes what form and degree of human involvement satisfies the dignity requirement.
Dynamic autonomy management frameworks that maintain meaningful human involvement in critical targeting decisions—while leveraging autonomous capabilities for functions that do not directly implicate human dignity, such as target detection and tracking—may offer a path through this ethical tension that neither categorical prohibition nor unrestricted autonomy can provide.

The Campaign to Stop Killer Robots and Civil Society Perspectives

The Campaign to Stop Killer Robots, launched in 2013 by a coalition of nongovernmental organizations led by Human Rights Watch, represents the most prominent civil society effort to prohibit fully autonomous weapons systems. Human Rights Watch (2012) published the foundational report of this movement, "Losing Humanity: The Case Against Killer Robots," which argued that fully autonomous weapons would be unable to comply with international humanitarian law, would create unacceptable accountability gaps, and would undermine human dignity. The report called for a preemptive international ban on fully autonomous weapons before they are developed and deployed. Human Rights Watch and International Human Rights Clinic (2020) subsequently proposed specific elements for a treaty on autonomous weapons, drawing on precedents from other weapons control regimes including the Convention on Cluster Munitions and the Anti-Personnel Mine Ban Convention. Their proposal identified specific prohibitions, obligations, and implementation mechanisms that a comprehensive treaty on autonomous weapons should include, providing the most detailed civil society blueprint for international regulation. The Stop Killer Robots campaign (2022) supplemented this work with analysis of ten specific weapons systems exhibiting increasing autonomy, documenting the trajectory toward greater autonomous capability and the inadequacy of existing governance mechanisms.
The ICRC has maintained a distinct institutional position that, while not calling for a categorical ban, has emphasized the humanitarian and ethical concerns raised by autonomous weapons and advocated for new international rules to address them. The ICRC (2021) called for new legally binding rules on autonomous weapons to ensure human control over the use of force, recommending specific prohibitions on autonomous weapons that target humans and regulations requiring human oversight in the employment of autonomous weapons against other targets. The SIPRI analyses (Boulanin et al., 2020; SIPRI, 2023) have contributed to this governance discourse by providing technically informed analysis of the feasible limits on autonomy in weapons systems and the practical requirements for human control.

Explainable AI and Transparency

The capacity of autonomous weapons systems to explain their decision processes to human operators is a critical enabler of meaningful human control and effective dynamic autonomy management. This section examines the development of explainable AI capabilities for military applications, the types of explanations required by military operators, evaluation methods and standards for XAI, and the specific transparency requirements for autonomous weapons decision-making.

DARPA XAI Program and Military Applications

The Defense Advanced Research Projects Agency's Explainable Artificial Intelligence (XAI) program, launched in 2017, represented the most ambitious and well-funded effort to develop AI systems capable of explaining their decision processes to human users.
Gunning and Aha (2019) described the program's objectives, approach, and early results, noting that the program pursued three technical approaches: modified deep learning to produce more explainable models, alternative machine learning techniques that are inherently more interpretable, and post-hoc explanation methods that generate explanations for the decisions of existing black-box models. Gunning et al. (2019) provided a broader assessment of the XAI field, situating the DARPA program within the larger scientific context and identifying the key technical challenges that remained to be addressed. The XAI program's results demonstrated both the feasibility and the limitations of current approaches to explainability in AI systems. The program achieved significant advances in generating explanations for individual AI decisions, but also revealed that the quality and usefulness of explanations depended critically on the explanation's alignment with the user's cognitive model and decision-making process. Explanations that were technically accurate but presented in formats unfamiliar to the user, or that addressed aspects of the decision that the user did not consider relevant, proved to be of limited value—a finding that underscores the importance of user-centered design in military XAI applications. The military applications of XAI extend beyond autonomous weapons to encompass the full range of AI-enabled decision support in military operations, including intelligence analysis, logistics planning, and mission planning. However, the requirements for XAI in autonomous weapons contexts are uniquely demanding due to the lethal consequences of autonomous decisions and the legal requirement for meaningful human control.
Liao and Varshney (2022) examined the transition from algorithms to user experiences in human-centered XAI, emphasizing that effective explainability requires not only technically sound explanations but also user interfaces and interaction patterns that enable users to efficiently access, comprehend, and act upon those explanations in operationally relevant timeframes.

Explanation Types for Military Operators

The types of explanations required by military operators depend on their role, expertise, and decision-making context. Hoffman et al. (2023) provided a comprehensive framework for measuring XAI effectiveness, identifying five dimensions: explanation goodness (the intrinsic quality of the explanation), user satisfaction, mental model accuracy (the degree to which the explanation supports accurate understanding of the AI system's behavior), curiosity (the degree to which the explanation stimulates further inquiry), and trust (the degree to which the explanation supports appropriate trust calibration). Their framework provides the most complete set of metrics for evaluating XAI effectiveness in military applications. Arrieta et al. (2020) provided a comprehensive taxonomy of XAI concepts, techniques, and challenges, organizing the field around the distinction between transparent models (which are inherently interpretable) and post-hoc explainability methods (which generate explanations for opaque models after the fact). Their analysis identified the tradeoff between model accuracy and interpretability that constrains XAI design choices: the most accurate models (typically deep neural networks) are generally the least interpretable, while the most interpretable models (such as decision trees and rule-based systems) typically sacrifice some degree of accuracy.
For autonomous weapons systems, this tradeoff has direct operational and legal implications, as the choice between accuracy and interpretability affects both the system's performance in target identification and engagement and the human operator's ability to exercise meaningful oversight. Miller (2019) provided an influential analysis of explanation in AI drawing on insights from the social sciences, arguing that effective AI explanations must conform to the cognitive and social norms that govern human-to-human explanation rather than simply providing technically complete accounts of the AI's decision process. His analysis identified several properties of effective explanations including contrastiveness (explaining why the AI chose one action rather than another), selectivity (focusing on the most relevant factors rather than providing exhaustive accounts), and social context sensitivity (adapting the explanation to the audience's knowledge and needs). These properties have direct application to military XAI, where operators require explanations that are tailored to their specific role, expertise level, and decision-making context.

XAI Evaluation Methods and Standards

The evaluation of XAI systems presents significant methodological challenges, as the effectiveness of an explanation is inherently subjective and context-dependent. Phillips et al. (2020) developed the NIST framework for explainable AI, identifying four principles—explanation, meaningful, explanation accuracy, and knowledge limits—that explainable AI systems should satisfy. Their framework provided the first standardized set of evaluation criteria for XAI, though its application to military contexts requires adaptation to account for the unique demands of military decision-making environments.
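Miller's contrastiveness and selectivity properties, discussed above, can be sketched in a few lines: rather than reporting every factor in a decision, the system reports only the few factors that most favored the chosen option over a rejected alternative. The factor names and scores below are hypothetical:

```python
def contrastive_explanation(chosen: dict, rejected: dict, top_k: int = 2) -> list:
    """Explain why `chosen` outranked `rejected` using only the top_k factors
    with the largest score advantage (contrastive + selective, per Miller)."""
    diffs = {factor: chosen[factor] - rejected[factor] for factor in chosen}
    ranked = sorted(diffs, key=diffs.get, reverse=True)
    return [f"{factor}: +{diffs[factor]:.2f} over alternative"
            for factor in ranked[:top_k]]

explanation = contrastive_explanation(
    chosen={"sensor_confidence": 0.90, "risk_margin": 0.70, "track_quality": 0.60},
    rejected={"sensor_confidence": 0.50, "risk_margin": 0.60, "track_quality": 0.65},
)
# explanation == ["sensor_confidence: +0.40 over alternative",
#                 "risk_margin: +0.10 over alternative"]
```

Note that the third factor, on which the rejected option actually scored higher, is deliberately omitted: selectivity trades completeness for the focused account Miller argues human recipients expect.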
Doshi-Velez and Kim (2017) proposed a framework for rigorous evaluation of interpretable machine learning that distinguished among application-grounded evaluation (testing with real users on real tasks), human-grounded evaluation (testing with real users on simplified tasks), and functionally-grounded evaluation (testing using formal definitions as proxies for human judgment). The application of these evaluation frameworks to military XAI faces the additional challenge that realistic evaluation requires military-domain expertise and operationally relevant scenarios that are difficult to replicate in research settings. The tension between experimental control and ecological validity is particularly acute for XAI evaluation in weapons employment contexts, where the consequences of the decisions being supported are uniquely severe and the operational pressures uniquely intense. The absence of standardized XAI evaluation methods for military applications was identified by the Pokorny (2026) review as a significant gap, reflecting the broader challenge of developing evaluation methodologies that can assess XAI effectiveness under conditions that approximate the stress, time pressure, and uncertainty of actual military operations.

Transparency Requirements for Autonomous Weapons Decision-Making

The transparency requirements for autonomous weapons decision-making represent the intersection of XAI capabilities with the legal and ethical requirements for meaningful human control. Holland Michel (2020) examined the specific transparency challenges posed by military AI systems, arguing that the inability to fully predict and understand AI system behavior constitutes a fundamental obstacle to responsible autonomous weapons employment.
His analysis identified the "black box" problem—the opacity of complex AI decision processes—as particularly acute in military contexts where the consequences of opaque decisions can include loss of human life. The Defense Innovation Board's (2019) AI principles specified traceability as a core requirement for DoD AI systems, mandating that the Department's AI capabilities have an auditable methodology, data source, design procedure, and documentation. The NATO principles of responsible use for AI in defence (NATO, 2024) similarly emphasized explainability as a foundational requirement, specifying that AI systems should be appropriately understandable and transparent to relevant personnel. These institutional requirements create a clear demand for XAI capabilities in autonomous weapons systems, but as the technical literature demonstrates, satisfying this demand in practice—particularly under the time constraints of tactical military operations—remains a significant technical and design challenge. For dynamic autonomy management specifically, transparency requirements operate at two levels. First, the autonomous system must be able to explain its individual decisions and recommendations to the human operator, enabling the operator to assess whether those decisions and recommendations are consistent with the rules of engagement, international humanitarian law, and the commander's intent. Second, the dynamic autonomy management system itself must be transparent—the human operator must be able to understand why the system is recommending a particular level of autonomy for a given situation and must retain the ability to override that recommendation based on their own assessment. This dual-level transparency requirement presents a significant design challenge that existing XAI approaches have not fully addressed.
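One way to make the dual-level requirement concrete is to carry both rationales in a single auditable record, with the operator's override always dominating the system's autonomy proposal. The structure and field names below are an illustrative assumption, not a fielded design:

```python
from dataclasses import dataclass, field

@dataclass
class TransparencyRecord:
    """Pairs a tactical recommendation with two rationales: one for the
    recommendation itself (level 1) and one for the autonomy level the
    management system proposes (level 2). All field names are hypothetical."""
    recommendation: str
    decision_rationale: list = field(default_factory=list)   # level 1
    proposed_autonomy: str = "HOTL"                          # e.g. "HOTL"
    autonomy_rationale: list = field(default_factory=list)   # level 2
    operator_override: str = ""  # operator may substitute another level

    def effective_autonomy(self) -> str:
        # The human operator's override, when present, always dominates.
        return self.operator_override or self.proposed_autonomy
```

A record like `TransparencyRecord("hold fire", ["low sensor confidence"], "HOTL", ["contested environment"], operator_override="HITL")` would report `"HITL"` as its effective autonomy level, preserving both rationales and the override in one traceable artifact.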
Computational Modeling and Simulation

Computational modeling and simulation provide essential methodological tools for examining dynamic autonomy management in contexts where empirical field research is constrained by ethical, practical, and security considerations. This section examines the application of agent-based modeling to defense problems, simulation-based analysis of command and control, wargaming methodologies, and the validation challenges that computational approaches to military systems entail.

Agent-Based Modeling in Defense

Agent-based modeling (ABM) has emerged as a particularly powerful computational methodology for examining complex military systems characterized by heterogeneous agents, nonlinear interactions, and emergent behaviors. Ilachinski (2004) provided the foundational treatment of agent-based modeling for military applications in his comprehensive analysis of artificial war through multiagent-based simulation of combat. His work demonstrated that agent-based models could capture dynamics of military operations—including the emergence of collective behaviors from individual agent interactions, the effects of information asymmetries on operational outcomes, and the sensitivity of combat outcomes to organizational structures—that traditional analytical models could not adequately represent. Ilachinski (2009) subsequently developed the EINSTein artificial-life laboratory, an agent-based simulation environment specifically designed for exploring self-organized emergence in land combat. The EINSTein model demonstrated that realistic combat outcomes could emerge from relatively simple agent-level rules governing movement, sensing, communication, and engagement, providing evidence that agent-based approaches could capture essential dynamics of military operations without requiring the prohibitively detailed specification that traditional combat models demand.
Moffat (2011) extended the theoretical foundations of ABM for defense applications through his work on complexity theory and network-centric warfare, demonstrating that the computational tools of complexity science—including agent-based models, network analysis, and information-theoretic measures—provide powerful analytical frameworks for understanding the dynamics of modern military operations. Lauren and Stephen (2002) developed the Map-Aware Non-Uniform Automata (MANA) model, representing a distinct approach to agent-based combat simulation that emphasized the influence of terrain, movement, and spatial relationships on combat outcomes. The MANA model, developed by New Zealand's Defence Technology Agency, has been widely used for scenario analysis in multiple defense organizations, demonstrating the practical utility of agent-based approaches for informing military planning and capability development. Bonabeau (2002) provided a broader methodological foundation for agent-based modeling in social and organizational systems, identifying the conditions under which ABM offers advantages over traditional analytical and simulation approaches—conditions that are frequently present in military command and control environments. The application of ABM to dynamic autonomy management is particularly promising because agent-based models can naturally represent the distributed, interactive, and adaptive nature of human-AI teams in C2 environments. Individual agents can represent human commanders, AI systems, autonomous weapons, and other elements of the C2 architecture, with their interactions governed by rules that reflect the autonomy allocation framework being tested.
This enables systematic exploration of how different dynamic autonomy management strategies affect operational outcomes under varying conditions—a capability that is essential for developing and validating dynamic autonomy frameworks that cannot be tested through field experimentation.

Simulation-Based Analysis of Command and Control

Simulation-based approaches to C2 analysis provide controlled environments for examining how different C2 structures and processes affect operational outcomes. Cioppa et al. (2004) demonstrated the application of agent-based simulations to military problems, showing how simulation-based analysis could be used to explore the effects of organizational structure, communication patterns, and decision processes on mission outcomes across a wide range of scenarios. Their work established methodological standards for the design, execution, and analysis of military agent-based simulations that have been widely adopted. Page (2008) provided the broader theoretical context for agent-based modeling as a scientific methodology, articulating the epistemological foundations and methodological standards for agent-based models as tools for understanding complex systems. His analysis identified the conditions under which agent-based models provide scientific insight—including the presence of heterogeneous agents, adaptive behavior, and emergent properties—and the standards that agent-based models must meet to generate credible scientific conclusions. These methodological standards are directly applicable to the use of ABM for examining dynamic autonomy management in military C2, where the heterogeneity of human and AI agents, the adaptive nature of their interactions, and the emergent properties of human-AI teams satisfy the conditions under which ABM offers its greatest analytical value.
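As a concrete illustration of this modeling approach, the following minimal Python sketch represents the three C2 architectures as agents with different human review delays and documentation probabilities, then runs a small Monte Carlo comparison. All parameter values (review delays, documentation probabilities, iteration counts) are assumptions chosen for the sketch, not the calibrated Phase 2 values reported elsewhere in this dissertation.

```python
import random
from statistics import mean

# Illustrative (not study-calibrated) parameters for each C2 architecture:
# mean human review delay in seconds, and probability that an engagement
# decision is fully documented in the accountability chain.
ARCHITECTURES = {
    "HITL": {"review_delay": 7.0, "doc_prob": 0.97},  # human approves every engagement
    "HOTL": {"review_delay": 2.5, "doc_prob": 0.90},  # human supervises, may veto
    "HOVL": {"review_delay": 0.5, "doc_prob": 0.70},  # human sets intent, rarely intervenes
}

def run_trial(arch, rng):
    """Simulate one engagement: machine detection plus human-dependent review."""
    machine_latency = rng.uniform(0.2, 0.8)                   # sensor-to-decision time
    human_latency = rng.expovariate(1 / arch["review_delay"])  # human review time
    return {
        "latency": machine_latency + human_latency,
        "accountable": rng.random() < arch["doc_prob"],
    }

def monte_carlo(n_iter=5000, seed=7):
    """Summarize mean latency and accountability rate per architecture."""
    rng = random.Random(seed)
    summary = {}
    for name, arch in ARCHITECTURES.items():
        trials = [run_trial(arch, rng) for _ in range(n_iter)]
        summary[name] = {
            "mean_latency_s": round(mean(t["latency"] for t in trials), 2),
            "accountability_pct": round(
                100 * mean(t["accountable"] for t in trials), 1),
        }
    return summary

if __name__ == "__main__":
    for name, stats in monte_carlo().items():
        print(name, stats)
```

Even this toy model reproduces the qualitative speed-accountability tradeoff: tighter human control raises both latency and accountability chain integrity, while delegating authority does the reverse.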
Wargaming and Computational Models

Wargaming provides a complementary methodological tradition for examining military decision-making that bridges the gap between purely computational models and field operations. Perla (1990) provided the definitive guide to the art of wargaming, documenting the history, methodology, and application of wargames as tools for military analysis, education, and planning. His analysis emphasized that wargaming's unique contribution lies in its ability to incorporate human judgment, creativity, and adversarial thinking into analytical frameworks—capabilities that purely computational models cannot replicate. The integration of computational models with wargaming methodologies offers the potential to combine the scalability and reproducibility of computational simulation with the human insight and adversarial reasoning that wargaming provides.

The application of wargaming to dynamic autonomy management could take several forms. Tabletop exercises structured as wargames could be used to explore how military commanders make decisions about autonomy allocation in realistic scenarios, providing qualitative data on the factors that influence autonomy management decisions and the consequences of different allocation strategies. Computational wargames that combine human players with agent-based models of autonomous systems could be used to test dynamic autonomy management frameworks under conditions that incorporate both human strategic reasoning and the computational speed and scale of autonomous system operations. Morgan et al. (2020) demonstrated the application of RAND methodologies to ethical analysis of military AI, providing a template for rigorous analytical approaches that could be extended to dynamic autonomy management.
Validation Challenges for Military Agent-Based Models

The validation of agent-based models for military applications presents distinctive challenges that merit careful consideration in any research employing computational simulation. Unlike models of physical systems, which can often be validated against empirical observations, military ABMs seek to represent systems that include human decision-making, organizational dynamics, and adversarial interactions—phenomena that are inherently difficult to validate empirically. Ilachinski (2004) acknowledged these challenges in his treatment of artificial war, noting that validation of military ABMs requires multiple complementary approaches, including face validation by subject matter experts, comparison with historical outcomes, sensitivity analysis to identify the robustness of model findings, and cross-validation using multiple independent models.

The specific validation challenges for ABMs of dynamic autonomy management include the absence of real-world data on dynamic autonomy allocation in weapons employment contexts (since no such frameworks have been operationally deployed), the difficulty of validating models of human decision-making under extreme stress and time pressure, and the challenge of capturing the full complexity of adversarial interactions in which both sides employ autonomous systems with dynamic autonomy management. These validation challenges do not invalidate the use of ABM for dynamic autonomy management research, but they do require transparency about the models' limitations and careful triangulation with other methodological approaches—including the qualitative data from commander interviews and the experimental data from simulation-based testing—to build a robust evidence base for the proposed framework.
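One of the validation techniques Ilachinski names, sensitivity analysis, can be sketched directly. The fragment below perturbs each human review delay parameter one at a time and checks whether the qualitative latency ranking of the three architectures survives. The toy latency model, the delay values, and the 25% perturbation band are illustrative assumptions for the sketch, not parameters from the dissertation's Phase 2 model.

```python
import random
from statistics import mean

def mean_latency(review_delay_s, n=4000, seed=11):
    """Mean engagement latency under a toy model: a short machine decision
    time plus an exponentially distributed human review delay."""
    rng = random.Random(seed)
    return mean(rng.uniform(0.2, 0.8) + rng.expovariate(1 / review_delay_s)
                for _ in range(n))

def ranking_is_robust(hitl=7.0, hotl=2.5, hovl=0.5, perturbation=0.25):
    """One-at-a-time sensitivity check: does the latency ordering
    HOVL < HOTL < HITL survive a +/-25% perturbation of each delay?"""
    for scale in (1 - perturbation, 1.0, 1 + perturbation):
        for which in ("hitl", "hotl", "hovl"):
            d = {"hitl": hitl, "hotl": hotl, "hovl": hovl}
            d[which] *= scale  # perturb one parameter, hold the others fixed
            if not (mean_latency(d["hovl"]) < mean_latency(d["hotl"])
                    < mean_latency(d["hitl"])):
                return False
    return True
```

A finding that survives this kind of perturbation is robust to the specific parameter guesses; a finding that flips under small perturbations should not be reported as a model conclusion.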
Synthesis and Identification of Research Gaps

This concluding section synthesizes the key findings from across the literature review, identifies the critical research gaps that the present dissertation addresses, presents the conceptual framework that integrates the reviewed literature into a coherent analytical lens, and revisits the research questions in light of the literature's findings.

Summary of Key Findings Across the Literature

The literature reviewed in this chapter reveals a field characterized by rapid theoretical development, growing institutional commitment, and persistent empirical gaps. Several cross-cutting findings emerge from the synthesis of these diverse bodies of scholarship. First, the literature demonstrates broad consensus that fixed levels of autonomy are suboptimal for complex, dynamic military operations, and that some form of dynamic or adaptive autonomy management is necessary to balance operational effectiveness with meaningful human control. This consensus spans the human factors literature (Parasuraman et al., 2000; Kaber & Endsley, 2004; Endsley, 2017), the robotics literature (Dorais et al., 1999; Goodrich et al., 2001; Sellner et al., 2006), and the military policy literature (U.S. Department of Defense, 2023; NATO, 2024), suggesting that the need for dynamic autonomy management is not merely an academic construct but a practical requirement recognized across disciplines and institutions. Second, the trust literature consistently demonstrates that trust calibration—the alignment of operator trust with actual system capability—is the critical mediating variable in human-automation interaction, and that both overtrust and undertrust produce dangerous outcomes (Lee & See, 2004; Hoff & Bashir, 2015; Schaefer et al., 2016).
In the context of autonomous weapons, trust miscalibration carries uniquely severe consequences: overtrust could lead to unlawful engagements or civilian casualties, while undertrust could lead to failure to employ lawful defensive capabilities, resulting in unnecessary friendly casualties. The literature identifies multiple factors that influence trust calibration, including system transparency, explanation quality, prior experience, organizational culture, and individual differences, but provides limited guidance on how to manage trust dynamically in environments where the operational context is continuously changing. Third, the legal and ethical literature establishes meaningful human control as the dominant standard for autonomous weapons governance but provides limited operationalization of what meaningful human control requires in specific operational contexts (Santoni de Sio & van den Hoven, 2018; Ekelhof, 2019; Cavalcante Siebert et al., 2023). The gap between the abstract philosophical concept and its practical implementation in weapons systems design remains wide, and the literature offers few empirically validated guidelines for translating meaningful human control requirements into specific design features, operational procedures, or autonomy allocation strategies. Fourth, the command and control literature demonstrates that AI integration into C2 is proceeding rapidly at the institutional level, with major initiatives including JADC2, the establishment of the Chief Digital and AI Officer, and the NATO AI Strategy creating organizational momentum for AI-enabled C2 (Hoehn, 2022; Lingel et al., 2020; NATO, 2024). However, the doctrinal and procedural frameworks for managing the human-AI relationship in C2—including the allocation of decision authority between human commanders and AI systems—have not kept pace with the technical capabilities being developed.
The mission command philosophy, while philosophically compatible with human-AI collaboration, has not been formally adapted to incorporate AI agents as participants in the C2 process. Fifth, the computational modeling literature provides powerful methodological tools for examining dynamic autonomy management (Ilachinski, 2004; Moffat, 2011; Lauren & Stephen, 2002), but these tools have not yet been systematically applied to the specific problem of dynamic authority allocation between human commanders and autonomous weapons systems. The agent-based modeling paradigm, with its capacity to represent heterogeneous agents, adaptive behaviors, and emergent properties, appears particularly well-suited to this research problem, but its potential remains largely unrealized in this domain.

Critical Research Gaps

The synthesis of this literature reveals several critical research gaps that the present dissertation directly addresses. The most fundamental gap is the absence of any empirically validated framework for dynamically managing the allocation of decision authority between human commanders and autonomous weapons systems across different operational contexts. While the theoretical foundations for dynamic autonomy management are well-established across multiple disciplines, no study has integrated these foundations into a comprehensive framework and subjected that framework to empirical testing in weapons-employment scenarios. This gap exists at the intersection of the human factors literature, which provides the cognitive and performance foundations; the trust literature, which identifies the mediating variables; the legal and ethical literature, which establishes the normative constraints; and the military operations literature, which defines the practical requirements.
A second critical gap concerns the absence of empirically tested transfer-of-control protocols for autonomous weapons systems—standardized procedures for shifting decision authority between human operators and autonomous systems while maintaining operational effectiveness and meaningful human control. The literature identifies the need for such protocols (Dorais et al., 1999; Feigh et al., 2012), but none have been developed, tested, or validated for the specific requirements of weapons employment contexts, where the consequences of failed control transitions include potential violations of international humanitarian law. A third gap concerns the absence of empirical comparison of C2 architectures—human-in-the-loop, human-on-the-loop, and human-over-the-loop—across different types of military operations with measurable outcomes for both operational effectiveness and accountability. While the conceptual distinctions among these architectures are well-established, no empirical study has compared their performance across a range of operational scenarios, leaving military decision-makers without an evidence base for selecting the appropriate C2 architecture for specific mission requirements. A fourth gap concerns the measurement of trust in military human-AI teams. While multiple trust measurement instruments exist (Jian et al., 2000; Schaefer et al., 2016), the literature lacks a validated multi-modal trust measurement approach that combines physiological biomarkers, behavioral indicators, and self-report measures specifically calibrated for military human-AI teaming in weapons employment contexts. This measurement gap impedes both research on trust dynamics and the development of trust-aware dynamic autonomy management systems that could adjust autonomy levels based on real-time assessment of operator trust.
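To make the multi-modal measurement idea concrete, the sketch below fuses three normalized trust modalities into a single index and compares it against observed system reliability to obtain a signed calibration error. The field names, fusion weights, and the example values are hypothetical placeholders for illustration, not a validated instrument.

```python
from dataclasses import dataclass

@dataclass
class TrustSample:
    """One measurement epoch. All inputs are normalized to [0, 1]; the field
    names and fusion weights below are illustrative, not validated scales."""
    self_report: float    # e.g., a rescaled Jian et al. (2000) checklist score
    reliance_rate: float  # fraction of AI recommendations the operator accepted
    stress_proxy: float   # physiological arousal proxy (0 = calm, 1 = high)

def fused_trust(s: TrustSample, w=(0.5, 0.35, 0.15)) -> float:
    """Weighted fusion of the three modalities; high stress discounts trust."""
    return w[0] * s.self_report + w[1] * s.reliance_rate + w[2] * (1 - s.stress_proxy)

def calibration_error(trust: float, system_reliability: float) -> float:
    """Signed miscalibration relative to the system's observed reliability:
    positive = overtrust, negative = undertrust."""
    return trust - system_reliability

# Hypothetical operator who rates and relies on the system very highly:
sample = TrustSample(self_report=0.9, reliance_rate=0.95, stress_proxy=0.2)
# Against an actual system reliability of 0.70, this operator overtrusts.
```

A trust-aware autonomy manager of the kind described above could monitor this calibration error in real time and pull authority back toward the human when overtrust exceeds a threshold.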
Conceptual Framework for the Present Study

The conceptual framework for the present study integrates three complementary theoretical lenses identified through this literature review. The first lens is the levels of automation framework, grounded in Parasuraman et al.'s (2000) four-stage model and extended by Endsley's (2017, 2018) work on autonomy design. This framework provides the analytical structure for specifying autonomy levels across different functions in the engagement decision cycle—from information acquisition through action implementation—and for defining the parameter space within which dynamic autonomy management operates. The second lens is the meaningful human control framework, grounded in Santoni de Sio and van den Hoven's (2018) philosophical account and operationalized through the actionable design properties identified by Cavalcante Siebert et al. (2023). This framework provides the normative constraints that any dynamic autonomy management scheme must satisfy: the tracking condition (that system behavior reflects human reasons and intentions) and the tracing condition (that system behavior can be attributed to human agents for purposes of moral and legal accountability). These conditions establish the boundaries within which autonomy can be dynamically allocated, ensuring that no allocation strategy sacrifices meaningful human control for operational convenience. The third lens is complex adaptive systems theory, which provides the analytical framework for understanding the emergent dynamics of human-AI teams in C2 environments. This lens recognizes that the interaction between human cognitive processes and AI decision-making algorithms in high-stakes environments produces emergent behaviors that cannot be predicted from knowledge of individual components alone, necessitating computational modeling approaches—particularly agent-based modeling—that can capture these emergent dynamics.
The integration of these three lenses—the analytical structure of levels of automation, the normative constraints of meaningful human control, and the dynamic perspective of complex adaptive systems—provides the multidimensional conceptual framework required to address the dissertation's research questions.

Research Questions Revisited

In light of the comprehensive literature review presented in this chapter, the dissertation's three research questions can now be situated within the broader scholarly context and their significance more precisely articulated. Research Question 1—How should decision authority be dynamically allocated between human commanders and autonomous weapons AI across different operational phases?—addresses the most fundamental gap in the literature: the absence of a framework specifying how the allocation of functions between humans and AI should vary across the engagement decision cycle. The levels of automation literature (Parasuraman et al., 2000) provides the analytical vocabulary for this allocation, but the specific allocation decisions for weapons employment contexts have not been empirically investigated. The naturalistic decision-making literature (Klein, 1998) establishes that these allocation decisions must be compatible with the pattern-recognition and mental-simulation processes through which experienced commanders make decisions, while the trust literature (Lee & See, 2004; Hoff & Bashir, 2015) identifies the mediating variables that influence whether operators will actually rely on autonomous capabilities at the allocated level. Research Question 2—What transfer-of-control protocols preserve meaningful human agency without degrading operational tempo below mission-critical thresholds?—addresses the practical challenge of implementing dynamic autonomy management in time-critical military operations.
The meaningful human control literature (Santoni de Sio & van den Hoven, 2018; Ekelhof, 2019) establishes the normative requirements that transfer protocols must satisfy, while the ironies of automation literature (Bainbridge, 1983; Endsley & Kiris, 1995) identifies the cognitive risks associated with autonomy transitions. The mixed-initiative systems literature (Miller & Parasuraman, 2007; Shively et al., 2018) provides design concepts, but none have been validated for weapons employment contexts. Research Question 3—How do different C2 architectures affect both operational effectiveness and accountability in autonomous weapons employment?—addresses the need for comparative evidence on the relative merits of different approaches to human-AI authority allocation. The C2 literature (Boyd, 1996; Alberts & Hayes, 2003, 2006) provides the theoretical foundations for different C2 architectures, while the governance literature (U.S. Department of Defense, 2023; NATO, 2024) establishes the accountability requirements that these architectures must satisfy. The absence of empirical comparison across C2 architectures leaves military decision-makers without the evidence base needed to make informed choices about how to integrate autonomous weapons into their command structures. The present dissertation, through its sequential mixed-methods design integrating qualitative grounded theory, agent-based computational modeling, simulation-based experimentation, and tabletop exercise validation, is designed to address these gaps and provide the first empirically grounded dynamic autonomy management framework for human-AI command and control of autonomous weapons systems.

CHAPTER 3: METHODOLOGY

Introduction

This chapter presents the research methodology for investigating dynamic autonomy management in human-AI command and control (C2) for autonomous weapons systems (AWS).
The methodological design addresses the central problem identified in the preceding chapters: the absence of empirically validated frameworks for dynamically allocating decision authority between human commanders and autonomous weapons AI across the spectrum of military operations. As established in Chapter 2, the existing literature on autonomous weapons ethics, C2 architecture design, and trust calibration, while individually well-developed, has not been integrated into a cohesive framework validated against operational military requirements (Pokorny, 2026). The methodology described herein is designed to close that gap through a systematic, multi-phase research program that moves from qualitative exploration to computational modeling, experimental testing, and operational validation. The chapter is organized as follows. First, the overall research design and its rationale are presented, establishing the sequential mixed-methods architecture that integrates four distinct but interdependent research phases. Second, the research questions and associated hypotheses or analytic propositions are restated and mapped to specific methodological phases. Third, the population, sampling strategies, and units of analysis are defined for each phase. Fourth, data sources and collection procedures are described in detail, with reference to the consolidated dataset inventory compiled for this study. Fifth, core constructs are operationally defined and linked to specific measurement strategies. The chapter then presents the detailed procedures for each of the four research phases, followed by the data analysis plan, reliability and validity considerations, ethical safeguards, and methodological limitations. 
Throughout, every methodological choice is justified from the literature and aligned with the dissertation's aims of producing findings that are directly actionable for Joint Chiefs of Staff doctrine development and autonomous weapons governance policy. The three research questions guiding this study require investigation at multiple levels of analysis—from the granular coding of policy documents and practitioner discourse, through the computational simulation of complex adaptive human-AI systems, to the controlled experimental comparison of C2 architectures and the doctrinal validation of the resulting framework. No single methodological tradition is adequate to address this range. The sequential mixed-methods design adopted here reflects the recommendation of Creswell and Plano Clark (2018) that complex, multifaceted research problems in applied fields benefit from the complementary strengths of qualitative and quantitative approaches, integrated through deliberate sequencing and explicit points of methodological connection. The design also responds to the Pokorny (2026) systematic review's explicit call for research that moves beyond laboratory simulations and self-report measures toward multi-modal measurement, longitudinal field data, and realistic stress conditions that approximate genuine operational environments.

Research Design and Rationale

Overview of the Mixed-Methods Sequential Design

This dissertation employs a four-phase sequential mixed-methods design that integrates qualitative grounded theory development, agent-based computational modeling, simulation-based experimentation, and tabletop exercise validation. The design follows an exploratory sequential structure (Creswell & Plano Clark, 2018) in which the qualitative phase generates theoretical constructs and model parameters that are then tested and refined through successive quantitative and applied phases.
Each phase builds upon the outputs of its predecessor, creating a cumulative evidence base that culminates in a validated dynamic autonomy management framework suitable for doctrinal adoption. Phase 1 employs grounded theory methodology (Strauss & Corbin, 1998) to develop an empirically informed model of dynamic autonomy management requirements from publicly available policy documents, congressional testimony, government audit reports, and think-tank analyses. The grounded theory model identifies the key constructs, relationships, and boundary conditions that characterize effective human-AI authority allocation in weapons-employment C2 contexts. Phase 2 translates these qualitative findings into a computational agent-based model (ABM) that simulates dynamic authority allocation across three C2 architectures—human-in-the-loop, human-on-the-loop, and human-over-the-loop—under varying conditions of operational tempo, threat complexity, and ethical ambiguity. Phase 3 conducts controlled experiments using a purpose-built simulation environment to compare the three C2 architectures with human participants across engagement scenarios of varying complexity and time pressure. Phase 4 validates the resulting dynamic autonomy management framework through structured tabletop exercises with defense professionals, assessing framework usability, doctrinal compatibility, and accountability traceability.

Rationale for the Mixed-Methods Approach

The mixed-methods sequential design is justified on several grounds. First, the research problem spans domains that are epistemologically distinct: the qualitative exploration of how decision authority is conceptualized in policy and practice requires interpretive methods, while the comparative assessment of C2 architectures demands the precision of experimental and computational approaches.
Johnson and Onwuegbuzie (2004) argued that mixed-methods research provides a logical and practical alternative when neither qualitative nor quantitative methods alone are sufficient to capture the complexity of a phenomenon. The dynamic allocation of authority between humans and AI in lethal-force contexts is precisely such a phenomenon, involving normative judgments, cognitive processes, organizational dynamics, and measurable performance outcomes that no single method can adequately address. Second, the sequential structure ensures that each phase is empirically grounded in the findings of its predecessor rather than relying on a priori assumptions. The grounded theory phase ensures that the computational model reflects the actual concerns, constraints, and decision processes articulated in the policy and practitioner literature, rather than the researcher's preconceptions. The agent-based model, in turn, generates specific predictions about the performance of different C2 architectures that can be tested experimentally. The tabletop exercise validates the practical applicability of the framework in a context that approximates doctrinal development processes used by the Joint Staff. This sequential logic follows Morse's (2003) principle of methodological coherence, in which the research design is driven by the nature of the research questions rather than by methodological convenience. Third, the design addresses the specific methodological gaps identified in the Pokorny (2026) systematic review, which found that existing research on human-AI teaming in military contexts is overwhelmingly Western-centric (73%), simulation-based, and reliant on self-reported metrics.
The present design mitigates these limitations by incorporating multiple data modalities (document analysis, computational simulation, behavioral measurement, expert evaluation), employing multi-modal trust measurement that combines validated self-report scales with behavioral indicators, and grounding the entire framework in publicly available operational data rather than simulated scenarios alone.

Integration Across Phases

Integration is the hallmark of rigorous mixed-methods research and the mechanism through which the complementary strengths of different methods are realized (Fetters et al., 2013). In this design, integration occurs at four points. First, the grounded theory constructs and propositions generated in Phase 1 directly parameterize the agent-based model in Phase 2, ensuring that the computational model reflects empirical rather than hypothetical relationships. Second, the ABM outputs—specifically, the predicted performance differences among C2 architectures under varying conditions—generate the experimental hypotheses tested in Phase 3. Third, the experimental results are translated into a tabletop exercise framework that presents the findings in operationally meaningful terms for Phase 4 validation. Fourth, the meta-inferences drawn at the conclusion of the study synthesize qualitative, computational, experimental, and evaluative evidence into a unified dynamic autonomy management framework, following the convergent integration strategy described by Creswell and Plano Clark (2018).
Figure 3.1

Four-Phase Sequential Mixed-Methods Research Design

Phase 1: Qualitative Grounded Theory
Key activities: Document corpus development; open/axial/selective coding; constant comparison
Primary outputs: Grounded theory model; codebook; theoretical propositions
Integration: Informs model parameters for Phase 2

Phase 2: Agent-Based Computational Modeling
Key activities: Model parameterization; C2 architecture simulation; sensitivity analysis
Primary outputs: Calibrated ABM; performance metrics; sensitivity results
Integration: Generates scenarios and hypotheses for Phase 3

Phase 3: Simulation-Based Experimentation
Key activities: Scenario-based experiments; 3 C2 architectures; 120+ participants
Primary outputs: Experimental data; trust measures; performance comparisons
Integration: Produces empirical results for Phase 4 validation

Phase 4: Tabletop Exercise Validation
Key activities: Doctrinal validation; expert evaluation; framework refinement
Primary outputs: Validated framework; policy recommendations; doctrine inputs
Integration: Triangulates all phases into final framework

Note. Each phase builds sequentially on the outputs of its predecessor. Arrows indicate primary data flow; the integration column describes how each phase's outputs inform subsequent phases. The design follows an exploratory sequential structure (Creswell & Plano Clark, 2018) culminating in a validated dynamic autonomy management framework.

Research Questions and Hypotheses

This dissertation is guided by three overarching research questions, each of which is decomposed into testable hypotheses or analytic propositions mapped to specific research phases. The research questions were originally articulated in the dissertation proposal and are grounded in the critical gaps identified through the systematic literature review (Pokorny, 2026) and the comprehensive literature review presented in Chapter 2.
Research Question 1

RQ1: How should decision authority be dynamically allocated between human commanders and autonomous weapons AI across different operational phases (surveillance, identification, tracking, engagement, and post-engagement assessment)?

This question addresses the core design challenge of dynamic autonomy management: determining the optimal allocation of decision authority at each stage of the engagement decision cycle. The question is explored qualitatively in Phase 1 through the identification of allocation principles and constraints in the policy literature, computationally in Phase 2 through ABM simulation of alternative allocation strategies, and experimentally in Phase 3 through the comparison of fixed versus dynamic allocation conditions. The following hypotheses and propositions are derived from this question:

Proposition 1a (Phase 1): Policy documents, congressional testimony, and expert analyses will reveal a consistent set of contextual factors—including threat imminence, rules of engagement specificity, target discrimination difficulty, and collateral damage risk—that practitioners identify as determinants of appropriate autonomy level allocation across the engagement decision cycle.

Hypothesis 1b (Phase 2): In agent-based simulations, dynamic autonomy allocation strategies that adjust authority level based on operational-phase-specific triggers will produce significantly higher decision quality scores and lower accountability-chain failures than fixed-allocation strategies across a range of operational tempos.

Hypothesis 1c (Phase 3): Participants operating under dynamic autonomy allocation protocols will demonstrate superior decision accuracy and rules-of-engagement adherence compared to participants operating under fixed human-in-the-loop or fixed human-on-the-loop conditions, as measured by scenario performance metrics.
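The allocation logic hypothesized in H1b, phase-specific defaults adjusted by contextual triggers, can be sketched as a small lookup-and-adjust rule. The default autonomy levels and trigger thresholds below are illustrative assumptions for the sketch, not the allocations the study derives; the contextual factors are the ones named in Proposition 1a.

```python
from enum import Enum

class Phase(Enum):
    SURVEILLANCE = 1
    IDENTIFICATION = 2
    TRACKING = 3
    ENGAGEMENT = 4
    ASSESSMENT = 5

# Illustrative default autonomy levels per phase (1 = fully manual,
# 5 = fully autonomous); values are assumptions for this sketch.
DEFAULTS = {
    Phase.SURVEILLANCE: 5,
    Phase.IDENTIFICATION: 4,
    Phase.TRACKING: 4,
    Phase.ENGAGEMENT: 2,   # lethal action defaults to tight human control
    Phase.ASSESSMENT: 3,
}

def allocate(phase, threat_imminence, discrimination_difficulty, collateral_risk):
    """Adjust the phase default using contextual factors, each scored in [0, 1]."""
    level = DEFAULTS[phase]
    if threat_imminence > 0.8:
        level += 1  # imminent threat: permit faster machine response
    if discrimination_difficulty > 0.6 or collateral_risk > 0.5:
        level -= 2  # ambiguity or civilian risk: pull authority back to the human
    return max(1, min(5, level))  # clamp to the defined level range
```

The point of the sketch is the structure, not the numbers: a dynamic strategy is a function from (phase, context) to an autonomy level, which is exactly the object the Phase 2 simulations compare against fixed allocations.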
Research Question 2

RQ2: What transfer-of-control protocols preserve meaningful human agency without degrading operational tempo below mission-critical thresholds?

This question focuses on the design of transition mechanisms between autonomy levels, directly addressing the tension between meaningful human control and operational effectiveness identified by Scharre (2018) and the gap in validated transfer-of-control protocols noted in the research gaps analysis (AWS-3, C2-3). The following hypotheses are advanced:

Proposition 2a (Phase 1): Grounded theory analysis will identify a typology of transfer-of-control triggers—including time-based, event-based, confidence-threshold-based, and operator-initiated triggers—along with associated verification checkpoints and fallback mechanisms that practitioners consider essential for maintaining meaningful human agency.

Hypothesis 2b (Phase 2): Transfer-of-control protocols incorporating graduated verification checkpoints will produce response latencies within mission-critical thresholds (operationally defined per scenario) while maintaining accountability chain integrity above 90%, as measured in agent-based simulations.

Hypothesis 2c (Phase 3): Operators employing structured transfer-of-control protocols will report higher perceived agency (meaningful human control), as measured by adapted meaningful human control scales, without statistically significant degradation in engagement timeline compliance compared to fully automated conditions.

Research Question 3

RQ3: How do different C2 architectures (human-in-the-loop, human-on-the-loop, human-over-the-loop) affect both operational effectiveness and accountability traceability in autonomous weapons employment?

This question directly addresses the comparative assessment gap (C2-5) identified in the research gaps analysis.
No empirical study has compared these three C2 paradigms across operational scenarios with measurable outcomes for both effectiveness and accountability. The following hypotheses are derived:

Hypothesis 3a (Phase 2): ABM simulations will reveal significant main effects of C2 architecture type on the dependent variable vector (decision quality, response latency, accountability chain integrity, operator cognitive load), with interaction effects between architecture type and operational tempo.

Hypothesis 3b (Phase 3): Human-on-the-loop and human-over-the-loop conditions will produce faster response latencies than human-in-the-loop but lower accountability traceability scores, as measured by documentation completeness and decision-trail auditability metrics.

Hypothesis 3c (Phase 3): The interaction between C2 architecture type and scenario complexity will be significant, such that the performance advantages of higher-autonomy architectures increase under high-tempo conditions but diminish or reverse under ethically ambiguous conditions requiring nuanced discrimination judgments.
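The transfer-of-control trigger typology anticipated in Proposition 2a (time-based, event-based, confidence-threshold-based, and operator-initiated triggers) can be made concrete with a small sketch. The four trigger classes are taken from the text; the dataclass, field names, and the 0.7 confidence threshold are hypothetical assumptions for illustration only.

```python
# Illustrative typology of transfer-of-control triggers (Proposition 2a).
# Trigger classes are from the text; all names and thresholds are assumed.
from dataclasses import dataclass

TRIGGER_TYPES = {"time", "event", "confidence_threshold", "operator_initiated"}

@dataclass
class TransferTrigger:
    kind: str                                     # one of TRIGGER_TYPES
    description: str
    requires_verification_checkpoint: bool = True  # checkpoint before handover

    def __post_init__(self):
        if self.kind not in TRIGGER_TYPES:
            raise ValueError(f"unknown trigger kind: {self.kind}")

def should_transfer(trigger: TransferTrigger, ai_confidence: float,
                    operator_request: bool) -> bool:
    """Decide whether control transfers back to the human for this step."""
    if trigger.kind == "operator_initiated":
        return operator_request
    if trigger.kind == "confidence_threshold":
        # Hand control back when AI confidence drops below 0.7 (assumed value).
        return ai_confidence < 0.7
    # Time- and event-based triggers fire once their condition has been met.
    return True
```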
Table 3.1

Research Questions, Data Sources, Variables, and Analytic Methods

RQ/Hypothesis | Phase(s) | Data Sources | Key Variables | Analysis Method
RQ1 / P1a | 1 | Congressional testimony, GAO reports, DTIC documents, think-tank publications | Decision authority allocation patterns, contextual triggers, operational phase | Grounded theory; constant comparative coding
RQ1 / H1b | 2 | ABM simulation outputs; DoDD 3000.09 parameters; CRS/SIPRI data | Decision quality, accountability chain integrity, autonomy allocation strategy | Sensitivity analysis; MANOVA on simulation outputs
RQ1 / H1c | 3 | Simulation experimental data (N ≥ 120) | Decision accuracy, ROE adherence, scenario performance scores | Between-subjects ANOVA/MANOVA; post hoc comparisons
RQ2 / P2a | 1 | Document corpus (≥150 documents) | Transfer-of-control trigger types, verification checkpoints, fallback mechanisms | Axial coding; typology development
RQ2 / H2b | 2 | ABM simulation outputs | Response latency, accountability chain integrity (≥90%), protocol compliance | Threshold analysis; Monte Carlo simulation
RQ2 / H2c | 3 | Experimental data; self-report scales | Perceived agency (MHC scale), engagement timeline compliance, trust calibration | Mixed ANOVA; mediation analysis
RQ3 / H3a | 2 | ABM outputs across 3 architectures × scenario matrix | DV vector: decision quality, latency, accountability, cognitive load | MANOVA; interaction effects analysis
RQ3 / H3b | 3 | Experimental data across 3 C2 conditions | Response latency, accountability traceability, documentation completeness | One-way MANOVA; discriminant function analysis
RQ3 / H3c | 3 | Experimental data; complexity × architecture interaction | Performance advantage differential, discrimination accuracy | Factorial ANOVA; simple effects analysis
Validation | 4 | Tabletop exercise evaluations | Framework usability, doctrinal compatibility, accountability traceability | Qualitative thematic analysis; validation matrix scoring

Note.
RQ = Research Question; P = Proposition (qualitative); H = Hypothesis (quantitative); MHC = Meaningful Human Control; ROE = Rules of Engagement; DV = Dependent Variable; ABM = Agent-Based Model.

Population, Sampling, and Units of Analysis

Document Corpus and Artifact Populations

The document corpus for Phase 1 qualitative analysis comprises four categories of publicly available artifacts: (a) congressional hearing transcripts from the Senate and House Armed Services Committees addressing autonomous weapons, military AI, and lethal autonomous weapons systems, spanning the 115th through 119th Congresses (2018–2026); (b) Government Accountability Office reports and testimonies evaluating Department of Defense autonomous systems programs, AI strategies, and acquisition practices; (c) Congressional Research Service reports analyzing U.S. and international policies on lethal autonomous weapon systems, military AI, and autonomous systems; and (d) published analyses, policy briefs, and research reports from the RAND Corporation, Center for a New American Security, Center for Strategic and International Studies, Stockholm International Peace Research Institute, and the Hague Centre for Strategic Studies. The compiled dataset inventory contains 33 congressional hearing records, 11 GAO reports, 8 CRS reports, 15 think-tank publications, and 17 HRW/ICRC case study documents, supplemented by additional publicly available materials retrieved during the systematic search process to reach the target corpus of at least 150 documents sufficient for theoretical saturation (Strauss & Corbin, 1998).

Weapons Systems and Technical Data Populations

The weapons systems population for Phase 2 agent-based modeling is defined by the set of autonomous and semi-autonomous weapons platforms documented in publicly available sources.
The SIPRI autonomous weapons dataset compiled for this study contains 20 systems across 10 countries, categorized by autonomy level (human-in-the-loop, human-on-the-loop, autonomous defensive), system type (air defense, naval, ground, loitering munition, missile), and operational status (deployed, development, fielded). The weapons performance dataset contains 12 systems with detailed specifications including human control mechanisms, response time categories, and engagement parameters drawn from CRS reports and open-source defense publications. These data provide realistic system parameters for ABM calibration, ensuring that simulated autonomous agents reflect the actual capabilities and constraints of current and near-term weapons platforms.

Sampling Logic

Sampling for the document corpus follows purposive, criterion-based logic (Patton, 2015) designed to achieve maximum variation within the boundaries of the research questions. Inclusion criteria require that documents: (a) address decision authority allocation, autonomy management, human control, or C2 architecture design for autonomous or semi-autonomous weapons systems; (b) are publicly available without classification restrictions; (c) were published between 2012 (the year of the original DoD Directive 3000.09) and 2026; and (d) originate from authoritative government, legislative, military, or recognized policy-research institutions. Exclusion criteria remove documents that: (a) address civilian-only AI applications without defense relevance; (b) consist of non-substantive procedural records (e.g., session-opening remarks without testimony); or (c) duplicate content already captured from primary sources. For Phase 3 simulation experiments, participant sampling targets a minimum of 120 participants recruited from defense-adjacent academic programs, professional military education institutions, and defense industry conferences.
The sample size was calculated using G*Power (Faul et al., 2007) for a one-way MANOVA with three groups, four dependent variables, a medium effect size (f² = 0.0625, equivalent to Cohen's d = 0.5), power of .80, and alpha of .05, yielding a minimum required sample of 114, rounded up to 120 to account for attrition. Participants will be stratified by military experience level (no military experience, junior military, senior military/defense professional) to enable analysis of experience as a moderating variable. For Phase 4 tabletop exercises, participants will be recruited through purposive sampling from open academic and professional defense conferences, targeting 15–25 defense professionals with operational C2 experience.

Units of Analysis

The unit of analysis varies by phase, reflecting the multi-level nature of the research design. In Phase 1, the primary unit of analysis is the individual document or document section, with coding units defined as discrete passages containing references to autonomy allocation, transfer of control, or accountability mechanisms. In Phase 2, the unit of analysis is the individual simulation trial, defined as a single run of the ABM under a specified configuration of C2 architecture, operational tempo, and scenario parameters. In Phase 3, the unit of analysis is the individual participant's performance across the assigned experimental conditions, measured through a vector of dependent variables. In Phase 4, the unit of analysis is the tabletop exercise session, with evaluation data aggregated across participant responses within each session.

Data Sources and Data Collection Procedures

All data sources employed in this dissertation are publicly available and require no special access, registration fees, or security clearances. This design choice ensures broad dissemination of findings while avoiding classification complications, consistent with the ethical framework established in the proposal.
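The logic of the a priori power analysis described above can be sketched in code, assuming SciPy is available. Note a caveat: G*Power's MANOVA global-effects calculation also depends on the number of dependent variables and their correlations, which is how it arrives at N = 114; the univariate one-way ANOVA analogue below, using the same f² = 0.0625, k = 3, α = .05, and target power .80, therefore yields a different (larger) N. The solver itself is a standard noncentral-F computation, not the dissertation's instrument.

```python
# A priori power sketch for a one-way ANOVA (univariate analogue of the
# text's MANOVA calculation). Assumes SciPy; lambda = f^2 * N convention
# matches G*Power's fixed-effects ANOVA module.
from scipy.stats import f as f_dist, ncf

def anova_power(n_total: int, k: int, f2: float, alpha: float = 0.05) -> float:
    """Power of the omnibus F test with k groups and n_total participants."""
    df1, df2 = k - 1, n_total - k
    f_crit = f_dist.ppf(1 - alpha, df1, df2)   # critical F under H0
    lam = f2 * n_total                          # noncentrality parameter
    return 1 - ncf.cdf(f_crit, df1, df2, lam)

def min_n(k: int, f2: float, target_power: float = 0.80) -> int:
    """Smallest total N reaching the target power."""
    n = k + 2
    while anova_power(n, k, f2) < target_power:
        n += 1
    return n
```

For f² = 0.0625 and k = 3 this lands near the canonical G*Power value for the univariate case (roughly 155–160 total participants), illustrating why the multivariate test's N = 114 is the more efficient design.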
The following subsections describe each data source category, the retrieval and preparation procedures, and the consolidated dataset inventory.

Congressional Testimony and Legislative Records

Congressional hearing transcripts were systematically retrieved from Congress.gov using structured searches targeting the Senate Armed Services Committee, House Armed Services Committee, Senate Foreign Relations Committee, and relevant subcommittees. Search terms included "autonomous weapons," "lethal autonomous," "artificial intelligence military," "command and control AI," and "DoD autonomy." The compiled dataset contains 33 hearing records spanning 2018–2026, with structured fields including hearing identifier, date, congress number, committee, chamber, title, witnesses and affiliations, key topics, and source URL. Each transcript was screened for substantive testimony relevant to the research questions, and non-substantive procedural content was excluded from the analytic corpus.

Government Accountability Office Reports

Eleven GAO reports addressing DoD autonomous systems, military AI strategies, AI acquisition, and AI workforce development were retrieved from the GAO website (gao.gov). Reports were cataloged with structured fields including report number, publication date, title, key findings, recommendations, recommendation status (open, partially implemented, implemented), and source URL. Date coverage spans 2018–2025, providing longitudinal visibility into the evolution of GAO assessments of DoD autonomous systems governance. Key reports include GAO-22-104765, which documented significant gaps in DoD AI strategies and collaboration guidance, and GAO-24-106831, which assessed AI-enabled weapon systems development and testing frameworks.

Congressional Research Service Reports

Eight CRS reports were compiled, covering U.S.
policy on lethal autonomous weapon systems, international LAWS discussions, AI and national security, and autonomous systems issues for Congress. Reports were cataloged with fields including report identifier, title, latest version date, author, key policy issues, weapons systems discussed, autonomy classifications employed, and source URL. These reports are particularly valuable for Phase 2 model parameterization because they contain detailed descriptions of weapons system capabilities, policy constraints, and autonomy classification frameworks that directly inform agent behavioral rules and governance constraints in the ABM.

SIPRI Autonomous Weapons Data

The Stockholm International Peace Research Institute dataset contains 20 autonomous and semi-autonomous weapons systems across 10 countries, structured with fields for country, system name, system type, category, autonomy level, operational status, year first reported, capabilities description, notes, and source. This dataset, derived from SIPRI's 2017 "Mapping the Development of Autonomy in Weapon Systems" report and subsequent publications, provides the global context for system capabilities used to set realistic parameters in the agent-based model. Systems range from those first reported in 1980 to those reported in 2023, reflecting the full developmental trajectory of autonomous weapons technology.

DoD Directive 3000.09 Parameters

Governance parameters were extracted from the January 25, 2023, update of DoD Directive 3000.09, "Autonomy in Weapon Systems," and the original 2012 directive. The extracted dataset contains 34 parameters organized by category, with fields for parameter description, category, source document reference, and approval authority required.
These parameters define the legal and policy boundaries within which simulated C2 architectures must operate in the Phase 2 ABM, ensuring that the computational model respects actual governance constraints on autonomy in weapon systems.

Weapons Performance Data

Weapons system performance data were compiled from CRS reports, DoD publications, and open-source defense specifications for 12 autonomous and semi-autonomous weapons platforms. Structured fields include system name, service branch, country, system type, autonomy level, human control mechanism, response time category, engagement parameters, operational status, year fielded, and source. These data provide the technical parameters necessary for calibrating agent capabilities in the ABM, including realistic response times, engagement envelopes, and human-machine interface configurations.

Think-Tank Publications and Case Study Data

Fifteen key think-tank publications from RAND, CNAS, CSIS, and other recognized defense policy research organizations were cataloged with structured metadata including author, year, title, organization, publication type, key findings and arguments, policy recommendations, and source URL. Publication dates range from 2016 to 2025. Additionally, 17 Human Rights Watch and International Committee of the Red Cross publications were compiled, including case studies documenting ethical dilemmas, incidents involving autonomous or semi-autonomous systems, and ethical frameworks for autonomous weapons governance. Date coverage spans 1988–2025, providing historical depth for the construction of ethically complex engagement scenarios in Phase 3.

DARPA Assured Autonomy Program Data

Technical parameters, objectives, and performance metrics from DARPA's Assured Autonomy program were compiled from publicly available documentation.
The dataset contains 17 records with 23 structured fields covering technical areas, performer organizations, approaches, key tools, program objectives, performance metrics (baseline and improved), and methodological descriptions. These data provide the technical parameters that define realistic autonomous system behavior within the Phase 3 simulation environment and inform the verification and validation logic applied to the Phase 2 ABM.

Table 3.2

Dataset Inventory and Intended Use

Dataset | Source | Records | Date Range | Key Variables | Phase(s)
Congressional Testimony | Congress.gov | 33 | 2018–2026 | Hearing ID, committee, witnesses, key topics, testimony text | 1
GAO Reports | GAO.gov | 11 | 2018–2025 | Report number, findings, recommendations, status | 1, 2
CRS Reports | CRS / FAS | 8 | 2023–2025 | Policy issues, systems discussed, autonomy classifications | 1, 2
SIPRI Autonomous Weapons | SIPRI | 20 | 1980–2023 | Country, system, type, autonomy level, status, capabilities | 2
DoDD 3000.09 Parameters | DoD | 34 | 2012–2023 | Parameter, category, approval authority, governance constraints | 2
Weapons Performance | CRS, DoD, OSINT | 12 | 1980–present | System specs, response times, control mechanisms, engagement parameters | 2, 3
Think-Tank Publications | RAND, CNAS, CSIS | 15 | 2016–2025 | Key arguments, policy recommendations, frameworks | 1, 2
HRW/ICRC Cases | HRW, ICRC | 17 | 1988–2025 | Ethical dilemmas, IHL compliance, incident descriptions | 3
DARPA Assured Autonomy | DARPA | 17 | 2019–present | Technical parameters, performance metrics, approaches | 2, 3

Note. All datasets are publicly available and compiled in the consolidated Excel workbook (Dissertation_Data_Consolidated.xlsx). OSINT = Open-Source Intelligence; FAS = Federation of American Scientists. Phase mapping details are provided in the workbook's Phase_Mapping sheet.
Variables, Constructs, and Operational Definitions

This section defines the core constructs and variables investigated across the four research phases, provides operational definitions specifying how each construct is measured, and maps each variable to its relevant data source, measurement instrument, and research phase. The constructs are drawn from the theoretical foundations established in Chapter 2, including the Parasuraman et al. (2000) levels of automation model, the Santoni de Sio and van den Hoven (2018) meaningful human control framework, and the trust calibration literature (Lee & See, 2004; Hoff & Bashir, 2015; Schaefer et al., 2016).

Dynamic Autonomy

Dynamic autonomy refers to the real-time adjustment of the degree of decision authority granted to an autonomous weapons system based on changing operational conditions, consistent with the adaptive automation concept articulated by Feigh et al. (2012) and the sliding autonomy framework of Sellner et al. (2006). Operationally, dynamic autonomy is measured as the pattern of autonomy level transitions across the engagement decision cycle (surveillance, identification, tracking, engagement, post-engagement assessment), recorded as a time-series of authority allocation states. In Phase 1, dynamic autonomy is a qualitative construct identified through coding of policy and practitioner discourse. In Phases 2 and 3, it is operationalized as the independent variable defining the autonomy allocation protocol assigned to each experimental or simulation condition.

Meaningful Human Control

Meaningful human control (MHC) is defined following Santoni de Sio and van den Hoven (2018) as the conjunction of two conditions: tracking (the system's behavior responds to the human operator's moral reasons and intentions) and tracing (the human operator can be held meaningfully responsible for the system's actions).
Operationally, MHC is measured through a composite indicator comprising: (a) operator intervention capability (the ability to override, abort, or redirect the autonomous system at any point in the engagement cycle); (b) decision-trail completeness (the proportion of engagement decisions for which the human operator's authorization or acknowledgment is documented); and (c) perceived agency, measured through an adapted Meaningful Human Control Scale administered to experimental participants in Phase 3.

Trust Calibration

Trust calibration refers to the degree to which an operator's trust in the autonomous system matches the system's actual reliability and capability, following Lee and See's (2004) framework of "appropriate reliance." Operationally, trust calibration is measured through: (a) a validated self-report trust scale adapted from the Schaefer et al. (2016) meta-analysis and Jian et al.'s (2000) Trust in Automation scale; (b) behavioral indicators including decision override frequency (the rate at which operators override autonomous system recommendations) and information-seeking behavior (the frequency and depth of system-state queries before endorsing autonomous actions); and (c) trust calibration accuracy, computed as the correlation between reported trust levels and system reliability within each experimental block.

Decision Quality

Decision quality is defined as the degree to which engagement decisions conform to the ground-truth optimal decision as established by scenario design. In Phase 2 ABM simulations, decision quality is computed as the proportion of correct engage/do-not-engage decisions across all scenario events, where "correct" is defined by the scenario's predetermined ground truth (lawful target correctly engaged, unlawful target correctly withheld).
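Two of the quantitative indicators defined above reduce to simple computations: trust calibration accuracy is a within-block Pearson correlation between reported trust and system reliability, and Phase 2 decision quality is a proportion of decisions matching ground truth. The sketch below illustrates both; the function and variable names are hypothetical, not the dissertation's analysis code.

```python
# Illustrative computation of two indicators defined in this section.
# All names and the sample values in the usage are hypothetical.
import math

def trust_calibration_accuracy(reported_trust, system_reliability):
    """Pearson r between block-level reported trust and actual reliability."""
    n = len(reported_trust)
    mean_t = sum(reported_trust) / n
    mean_r = sum(system_reliability) / n
    cov = sum((t - mean_t) * (r - mean_r)
              for t, r in zip(reported_trust, system_reliability))
    var_t = sum((t - mean_t) ** 2 for t in reported_trust)
    var_r = sum((r - mean_r) ** 2 for r in system_reliability)
    return cov / math.sqrt(var_t * var_r)

def decision_quality(decisions, ground_truth):
    """Proportion of engage/do-not-engage decisions matching ground truth."""
    correct = sum(1 for d, g in zip(decisions, ground_truth) if d == g)
    return correct / len(decisions)
```

Well-calibrated trust yields r near +1; overtrust or undertrust shows up as a weak or negative correlation.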
In Phase 3, decision quality incorporates both accuracy (correct/incorrect) and appropriateness (proportionality, discrimination, necessity), scored by trained evaluators using a structured rubric derived from international humanitarian law principles (ICRC, 2021).

Response Latency

Response latency is defined as the elapsed time from stimulus onset (threat presentation or engagement cue) to final engagement decision (authorize, abort, or defer). In Phase 2, response latency is a model output measured in simulation time units calibrated to real-world engagement timelines derived from the weapons performance dataset. In Phase 3, response latency is measured in seconds from scenario event onset to the participant's recorded decision action. Mission-critical thresholds are defined per scenario based on the tactical context, with threshold values derived from published engagement timeline requirements in CRS and RAND analyses.

Accountability Chain Integrity

Accountability chain integrity is defined as the completeness and auditability of the decision trail linking each engagement action to an identifiable human authorization or acknowledgment. Operationally, accountability chain integrity is measured as the proportion of engagement decisions in a trial for which: (a) the authorizing human is identifiable; (b) the authorization timestamp is recorded; (c) the information basis for the decision is documented; and (d) the autonomous system's recommendation and confidence level are logged. This construct operationalizes the "tracing" condition of meaningful human control (Santoni de Sio & van den Hoven, 2018) and is scored on a continuous scale from 0 (no accountability documentation) to 1 (complete accountability trail for all engagement events).

Additional Constructs

Three additional constructs are measured as secondary dependent variables or covariates.
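Before turning to those constructs, the accountability chain integrity score just defined can be sketched as an audit-log function: the proportion of engagement records satisfying all four conditions (a) through (d). The record field names are hypothetical; only the four-condition scoring rule comes from the text.

```python
# Sketch of the accountability chain integrity score: the proportion of
# engagement decisions whose audit record satisfies conditions (a)-(d).
# The field names are hypothetical stand-ins for a real audit schema.

REQUIRED_FIELDS = ("authorizer_id",                     # (a) identifiable human
                   "authorization_timestamp",           # (b) timestamp recorded
                   "information_basis",                 # (c) decision basis
                   "ai_recommendation_and_confidence")  # (d) AI output logged

def accountability_chain_integrity(engagement_log: list) -> float:
    """Return a score in [0, 1]; 1 means a complete trail for every decision."""
    if not engagement_log:
        return 1.0  # vacuously complete; a real scorer might flag this instead
    complete = sum(
        1 for record in engagement_log
        if all(record.get(field) is not None for field in REQUIRED_FIELDS)
    )
    return complete / len(engagement_log)
```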
Operator cognitive load is measured using the NASA Task Load Index (NASA-TLX; Hart & Staveland, 1988), a validated six-subscale self-report instrument assessing mental demand, physical demand, temporal demand, performance, effort, and frustration. Rules of engagement (ROE) adherence is operationalized as the proportion of engagement decisions that comply with the scenario-specified ROE, scored by trained evaluators using a binary compliance/noncompliance rubric for each engagement event. Explainability and transparency are measured through the Explanation Satisfaction Scale (Hoffman et al., 2018), adapted for the autonomous weapons context, capturing participants' assessment of the sufficiency, completeness, and usefulness of the system's explanations of its recommendations and actions.

Table 3.3

Constructs, Operational Definitions, Indicators, and Measurement Strategy

Construct | Operational Definition | Indicators | Scale/Type | Source/Instrument | Phase(s)
Dynamic Autonomy | Real-time authority allocation pattern across engagement cycle | Autonomy level transitions; allocation state time-series | Categorical (3 levels) / Continuous (transition rate) | Coding framework (P1); ABM config (P2); Experimental condition (P3) | 1, 2, 3
Meaningful Human Control | Tracking + tracing conditions (Santoni de Sio & van den Hoven, 2018) | Intervention capability; decision-trail completeness; perceived agency | Composite: binary + proportion + Likert | MHC Scale (adapted); system logs | 2, 3, 4
Trust Calibration | Match between operator trust and system reliability | Self-report trust; override frequency; info-seeking behavior; calibration accuracy | Likert (7-point) + behavioral counts + correlation | Trust in Automation Scale (Jian et al., 2000); behavioral logs | 3
Decision Quality | Conformity to ground-truth optimal engagement decision | Accuracy (correct/incorrect); appropriateness (proportionality, discrimination) | Proportion (0–1); rubric scores | Scenario scoring rubric; evaluator ratings | 2, 3
Response Latency | Time from stimulus onset to engagement decision | Elapsed time (seconds/sim units) | Continuous (seconds) | System timestamp logs; ABM clock | 2, 3
Accountability Chain Integrity | Completeness of auditable decision trail | Authorization ID; timestamp; info basis; AI confidence logged | Proportion (0–1) | System audit logs; documentation rubric | 2, 3, 4
Operator Cognitive Load | Perceived workload during task performance | NASA-TLX subscales: mental, physical, temporal, performance, effort, frustration | Interval (0–100 per subscale) | NASA-TLX (Hart & Staveland, 1988) | 3
ROE Adherence | Compliance with scenario-specified rules of engagement | Binary compliance per engagement event | Proportion (0–1) | Evaluator rubric; scenario ROE parameters | 2, 3
Explainability / Transparency | Perceived quality of system explanations | Sufficiency, completeness, usefulness ratings | Likert (7-point) | Explanation Satisfaction Scale (adapted from Hoffman et al., 2018) | 3, 4

Note. P1 = Phase 1; P2 = Phase 2; P3 = Phase 3. MHC = Meaningful Human Control; ROE = Rules of Engagement; ABM = Agent-Based Model; NASA-TLX = NASA Task Load Index.

Instrumentation and Measures

Qualitative Coding Framework

The qualitative coding framework for Phase 1 document analysis is developed using the grounded theory approach of Strauss and Corbin (1998), employing three stages of coding: open coding, axial coding, and selective coding. The initial codebook is developed inductively from the first 30 documents in the corpus, with codes emerging from the data rather than being imposed a priori. Provisional code categories, informed by but not constrained to the theoretical framework, include: authority allocation patterns, transfer-of-control triggers, verification and checkpoint mechanisms, accountability structures, trust indicators, operational tempo considerations, ethical constraints, and C2 architecture preferences.
The codebook is iteratively refined through constant comparison as additional documents are coded, with new codes added and existing codes merged, split, or redefined as the analysis progresses toward theoretical saturation. Inter-coder reliability will be established using two independent coders, each coding a randomly selected 20% subset of the document corpus. Reliability will be assessed using Cohen's kappa (κ), with a target threshold of κ ≥ .80 indicating substantial agreement (Landis & Koch, 1977). Discrepancies will be resolved through discussion and consensus coding, with the codebook updated to clarify ambiguous code definitions. Analytic memos documenting the researcher's interpretive decisions, emerging theoretical insights, and methodological reflections will be maintained throughout the coding process to support the auditability and trustworthiness of the grounded theory development.

Trust and Cognitive Load Measurement

Trust in the autonomous system is measured using a multi-modal instrument combining three data streams, following the recommendation of the Pokorny (2026) review for moving beyond self-report measures alone. The self-report component adapts Jian et al.'s (2000) Trust in Automation scale, a 12-item instrument with established psychometric properties (Cronbach's α = .92–.96 across prior studies), supplemented by items from Schaefer et al.'s (2016) trust measurement framework targeting military-specific trust antecedents including perceived system reliability, transparency, and consequence severity.
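The Cohen's kappa criterion used for inter-coder reliability (κ ≥ .80) follows directly from its definition: observed agreement corrected for the agreement expected by chance given each coder's marginal code frequencies. A minimal from-scratch sketch, with hypothetical example labels:

```python
# Cohen's kappa from its definition: (p_o - p_e) / (1 - p_e), where p_o is
# observed agreement and p_e is chance agreement from the coders' marginals.
# Works for any hashable code labels; example data are hypothetical.
from collections import Counter

def cohens_kappa(coder_a: list, coder_b: list) -> float:
    assert coder_a and len(coder_a) == len(coder_b), "need paired, non-empty codes"
    n = len(coder_a)
    p_observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    p_expected = sum((freq_a[label] / n) * (freq_b[label] / n)
                     for label in set(coder_a) | set(coder_b))
    return (p_observed - p_expected) / (1 - p_expected)
```

Kappa of 1 indicates perfect agreement; values at or above .80 meet the Landis and Koch (1977) threshold cited in the text. (If both coders assign a single identical code throughout, p_e = 1 and kappa is undefined; a production scorer would guard that case.)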
The behavioral component captures decision override frequency (number of times the operator rejects or modifies the autonomous system's recommendation per trial), information-seeking depth (number and type of system-state queries initiated by the operator before endorsing an autonomous action), and automation verification time (elapsed time spent reviewing autonomous system outputs before accepting or overriding). Cognitive load is measured using the NASA-TLX (Hart & Staveland, 1988), administered after each experimental block.

Scenario Evaluation Rubrics

Engagement decision quality is evaluated using a structured rubric scored by trained evaluators. The rubric assesses each engagement decision on five dimensions: (a) target identification accuracy (correct identification of the target's combatant status, scored binary); (b) proportionality assessment (appropriateness of force relative to military advantage, scored on a 5-point scale); (c) discrimination compliance (avoidance of civilian harm, scored binary with severity weighting); (d) engagement timeline appropriateness (decision made within tactically acceptable timeframe, scored binary); and (e) accountability documentation completeness (decision trail meets audit requirements, scored as proportion). Evaluator training will include calibration exercises using pre-scored exemplar scenarios, with inter-rater reliability assessed using intraclass correlation coefficients (ICC ≥ .85 target).

Autonomy Condition Manipulations

The three C2 architecture conditions manipulated in Phase 3 are operationally defined as follows. In the human-in-the-loop (HITL) condition, the autonomous system presents targeting information and recommendations, but the participant must explicitly authorize every engagement decision; the system cannot engage without human approval.
In the human-on-the-loop (HOTL) condition, the autonomous system can initiate engagement autonomously, but the participant monitors system actions in real time and can intervene to abort or redirect at any point; a configurable time window (varying by scenario) allows the participant to review and override before the engagement is executed. In the human-over-the-loop (HOVL) condition, the participant sets strategic parameters (rules of engagement, engagement zones, target categories) before the scenario begins, and the autonomous system operates independently within those parameters; the participant receives post-action notifications and can adjust parameters between engagement cycles but does not authorize individual engagements. A post-scenario questionnaire will serve as the manipulation check, verifying that participants correctly perceived their authority level in each condition.

Phase 1: Qualitative Grounded Theory Procedures

Document Corpus Development

The document corpus is developed through a systematic, multi-source retrieval process designed to achieve comprehensive coverage of publicly available policy, legislative, and analytical materials relevant to dynamic autonomy management in AWS C2. The retrieval process proceeds in three stages. First, targeted searches of institutional databases (Congress.gov, GAO.gov, DTIC Discover, RAND.org, CNAS.org, CSIS.org, SIPRI.org) using structured search queries capture the core corpus of authoritative documents. Second, backward citation tracing from key documents identified in Stage 1 expands the corpus to include referenced materials not captured by initial searches. Third, forward citation searches using Google Scholar identify subsequent publications that cite the core documents, capturing the most recent developments in the discourse.
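The three C2 conditions operationalized in the preceding section (HITL, HOTL, HOVL) can be summarized as a single engagement-authorization rule, which is essentially what the experimental software must enforce. The function below is a hypothetical sketch of that rule, not the dissertation's testbed code; the argument names are assumptions.

```python
# Sketch of the three Phase 3 C2 conditions as an authorization rule.
# HITL: every engagement requires explicit human approval.
# HOTL: the system may engage once the review window elapses, unless aborted.
# HOVL: the system engages on its own when the target fits pre-set parameters.

def may_engage(condition: str, human_approved: bool, human_aborted: bool,
               review_window_elapsed: bool, within_preset_parameters: bool) -> bool:
    if condition == "HITL":
        return human_approved
    if condition == "HOTL":
        return review_window_elapsed and not human_aborted
    if condition == "HOVL":
        return within_preset_parameters
    raise ValueError(f"unknown condition: {condition}")
```

Note how accountability differs across branches: only HITL ties every engagement to an explicit approval, which is the mechanism behind the predicted traceability differences in H3b.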
The target corpus of at least 150 documents ensures sufficient breadth for theoretical saturation, consistent with grounded theory methodological guidance (Strauss & Corbin, 1998; Charmaz, 2014).

Coding Procedures

Coding follows the three-stage procedure of Straussian grounded theory. Open coding involves line-by-line and paragraph-level examination of each document, generating initial codes that capture the essential meaning of each passage relevant to autonomy management, authority allocation, transfer of control, trust, accountability, and C2 architecture. The researcher employs in vivo coding where possible, preserving the language used by policymakers, military leaders, and analysts to describe autonomy management concepts. Axial coding organizes initial codes into categories and subcategories, identifying relationships among categories through the paradigm model: causal conditions, context, intervening conditions, action/interaction strategies, and consequences. Selective coding integrates the category structure around a core category—anticipated to be dynamic autonomy management itself—producing the grounded theory model that represents the main findings of Phase 1.

Memoing and Constant Comparison

Analytic memos are written throughout the coding process to document theoretical ideas, emerging relationships among codes, methodological decisions, and reflexive observations. Memos serve as the primary vehicle for theory development, recording the researcher's evolving understanding of the data and the reasoning behind coding decisions. Constant comparison—the iterative process of comparing each new data segment with previously coded data, each code with other codes, and each category with other categories—is applied throughout all three coding stages.
This process ensures that the developing grounded theory is densely grounded in the data and that theoretical categories are internally consistent, mutually exclusive where appropriate, and collectively exhaustive of the phenomena observed in the data.

Trustworthiness Procedures

Phase 1 trustworthiness is ensured through multiple strategies aligned with Lincoln and Guba's (1985) criteria. Credibility is supported by prolonged engagement with the data, triangulation across multiple document types (congressional testimony, audit reports, policy analyses, case studies), and peer debriefing with a colleague experienced in qualitative military research. Dependability is established through a detailed audit trail documenting all coding decisions, codebook revisions, and analytic memos, and through inter-coder reliability assessment (κ ≥ .80). Confirmability is addressed through reflexive memoing and the maintenance of a decision log recording how preconceptions were identified and bracketed. Transferability is supported by thick description of the data, context, and analytic procedures sufficient for readers to assess the applicability of findings to their own contexts.

Phase 1 Outputs and Phase 2 Connection

Phase 1 produces three primary outputs: (a) a grounded theory model specifying the key constructs, relationships, and boundary conditions of dynamic autonomy management as articulated in the policy and practitioner literature; (b) a validated codebook suitable for application in subsequent qualitative analyses; and (c) a set of theoretical propositions linking contextual factors to optimal autonomy allocation strategies. These outputs directly inform Phase 2 by providing the construct definitions, relationship specifications, and parameter values needed to design the agent-based computational model.
Specifically, the grounded theory categories become agent attributes, the identified relationships become agent interaction rules, and the boundary conditions become model constraints.

Phase 2: Agent-Based Computational Modeling Procedures

Model Design and Parameterization

The agent-based model (ABM) translates the grounded theory findings from Phase 1 into a computational model capable of simulating dynamic authority allocation across three C2 architectures under systematically varied conditions. The model is developed using the ODD (Overview, Design concepts, Details) protocol (Grimm et al., 2010), the standard documentation framework for ABMs in the scientific community, ensuring transparency and reproducibility. Model parameters are derived from three primary sources: (a) the grounded theory model from Phase 1, which specifies the qualitative relationships among constructs; (b) the DoDD 3000.09 governance parameters dataset, which defines the legal constraints on autonomy allocation; and (c) the weapons performance and SIPRI datasets, which provide realistic system capability parameters, including response times, engagement envelopes, and autonomy levels.

Agent Specifications

The ABM includes four primary agent types. Human Commander agents represent human decision-makers at different echelons of the C2 hierarchy, with attributes including decision latency (calibrated from published C2 timeline data), trust level (initialized from Phase 1 findings and updated dynamically), cognitive load capacity (based on NASA-TLX research norms), and risk tolerance (varied parametrically across simulations). Autonomous Weapons System agents represent individual AWS platforms, with attributes including sensor accuracy, targeting reliability, engagement speed, and autonomy level (configured per C2 architecture condition).
Target agents represent entities in the operational environment, with attributes including combatant status (lawful target, civilian, ambiguous), threat level, and behavior patterns. Environment agents represent the operational context, generating scenario events (threat presentations, rules-of-engagement changes, communications disruptions, and ethical dilemma triggers) at rates determined by the operational tempo parameter.

C2 Architecture Implementation

The three C2 architectures are implemented as configurable interaction protocols governing the communication and authority relationships between Human Commander and AWS agents. In the HITL configuration, AWS agents transmit targeting recommendations to the Human Commander agent, which processes the information (with latency and accuracy parameters reflecting human cognitive constraints) and returns an authorization or rejection decision. In the HOTL configuration, AWS agents initiate engagement autonomously and simultaneously notify the Human Commander agent, which has a defined intervention window to override the action. In the HOVL configuration, the Human Commander agent sets strategic parameters at the beginning of each scenario phase, and AWS agents operate autonomously within those parameters, with the Human Commander receiving periodic status reports and post-action notifications.

Scenario Matrix

The simulation scenario matrix varies three independent factors: C2 architecture type (HITL, HOTL, HOVL), operational tempo (low, medium, high, defined by event presentation rate), and scenario complexity (low—clear targets in a permissive environment; medium—mixed combatant/civilian presence with moderate ambiguity; high—complex ethical dilemmas with high civilian presence and degraded communications).
This 3 × 3 × 3 factorial design produces 27 unique conditions, each run for a minimum of 1,000 Monte Carlo replications to ensure stable estimates of the dependent variable distributions. The full simulation campaign therefore comprises a minimum of 27,000 simulation trials.

Model Calibration, Sensitivity Analysis, and Validation

Model calibration proceeds through a three-stage process. First, face validity is assessed by presenting the model's structure and behavioral rules to subject matter experts recruited from among the Phase 1 document corpus authors and defense policy professionals, soliciting feedback on the model's fidelity to real-world C2 dynamics. Second, parameter sensitivity analysis identifies the model parameters with the greatest influence on dependent variable outputs, using Latin Hypercube Sampling to efficiently explore the multidimensional parameter space and Sobol indices to quantify parameter sensitivity (Saltelli et al., 2008). Third, pattern-oriented validation (Grimm et al., 2005) compares the ABM's emergent behavioral patterns against documented patterns in the qualitative data from Phase 1, assessing whether the computational model reproduces the dynamics described in policy and practitioner accounts.
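Latin Hypercube Sampling places exactly one draw in each of n equal-probability strata per dimension and then pairs strata randomly across dimensions. The following is a minimal pure-Python sketch over two hypothetical ABM parameters (commander decision latency and AWS sensor accuracy); in practice a sensitivity-analysis library such as SALib would generate the samples and compute the Sobol indices.

```python
import random

def latin_hypercube(n_samples, bounds, rng=random.Random(42)):
    """One stratified draw per interval in each dimension, randomly paired."""
    dims = len(bounds)
    samples = [[0.0] * dims for _ in range(n_samples)]
    for d, (lo, hi) in enumerate(bounds):
        # Divide [0, 1) into n strata, sample once inside each stratum, then
        # shuffle the column so strata are paired randomly across dimensions.
        column = [(i + rng.random()) / n_samples for i in range(n_samples)]
        rng.shuffle(column)
        for i in range(n_samples):
            samples[i][d] = lo + column[i] * (hi - lo)
    return samples

# Hypothetical parameter ranges: decision latency in seconds, sensor accuracy
# as a proportion. These are placeholders, not the calibrated model values.
draws = latin_hypercube(10, [(0.5, 10.0), (0.80, 0.99)])
```

Each of the 10 draws falls in a distinct latency stratum and a distinct accuracy stratum, so the full parameter range is covered with far fewer runs than a dense grid would require.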
Table 3.4

Agent-Based Model Entities, Rules, and Outputs

Human Commander Agent
  Attributes: Decision latency; trust level; cognitive load capacity; risk tolerance; experience level
  Behavioral Rules: Receives targeting data; processes with latency/accuracy; authorizes/rejects (HITL); monitors/overrides (HOTL); sets parameters (HOVL); trust updates based on outcomes
  Key Outputs: Authorization decisions; override actions; cognitive load state; trust trajectory

AWS Agent
  Attributes: Sensor accuracy; targeting reliability; engagement speed; autonomy level; confidence threshold
  Behavioral Rules: Detects/classifies targets; generates engagement recommendation with confidence score; executes per C2 protocol; logs decision trail
  Key Outputs: Engagement decisions; confidence scores; decision quality; response latency

Target Agent
  Attributes: Combatant status; threat level; behavior pattern; visibility; proximity to civilians
  Behavioral Rules: Moves through environment; presents engagement cues; varies behavior by scenario complexity
  Key Outputs: Engagement events; ground-truth status for scoring

Environment Agent
  Attributes: Operational tempo; comms reliability; ROE parameters; ethical dilemma triggers
  Behavioral Rules: Generates events at tempo-defined rates; introduces comms disruptions; triggers ROE changes; presents ethical dilemma scenarios
  Key Outputs: Scenario events; tempo metrics; disruption events

Transfer-of-Control Protocol
  Attributes: Trigger type (time, event, confidence, operator); verification checkpoint count; fallback mechanism
  Behavioral Rules: Evaluates trigger conditions; initiates autonomy transition; executes verification sequence; activates fallback if transition fails
  Key Outputs: Transition count; transition latency; fallback activations; accountability integrity

Note. HITL = Human-in-the-Loop; HOTL = Human-on-the-Loop; HOVL = Human-over-the-Loop; ROE = Rules of Engagement; AWS = Autonomous Weapons System. Agent attributes are parameterized from Phase 1 findings, DoDD 3000.09 parameters, and weapons performance data.
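The authority relationships in Table 3.4 can be illustrated with a minimal sketch of how a single engagement event might resolve under each architecture. All parameter values, probabilities, and the confidence threshold below are hypothetical placeholders, not the dissertation's calibrated model.

```python
from dataclasses import dataclass
import random

@dataclass
class Commander:
    decision_latency: float   # seconds to authorize or override (hypothetical)
    approve_prob: float       # chance the commander approves (hypothetical)

@dataclass
class Recommendation:
    confidence: float         # AWS target-classification confidence

def engage(arch, cmdr, rec, override_window=2.0, rng=random.Random(7)):
    """Resolve one engagement event under a C2 architecture.
    Returns (executed, latency_seconds)."""
    if arch == "HITL":        # explicit authorization before execution
        return (rng.random() < cmdr.approve_prob, cmdr.decision_latency)
    if arch == "HOTL":        # system acts unless overridden within the window
        overridden = (cmdr.decision_latency <= override_window
                      and rng.random() > cmdr.approve_prob)
        return (not overridden, override_window)
    if arch == "HOVL":        # autonomous within pre-set confidence parameters
        return (rec.confidence >= 0.9, 0.2)
    raise ValueError(arch)
```

The sketch makes the speed-accountability tradeoff concrete: HITL latency is bounded below by human decision time, HOTL latency by the intervention window, and HOVL latency only by machine processing.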
Phase 3: Simulation-Based Experimental Procedures

Scenario Design

The experimental scenarios are designed to present participants with realistic engagement decision-making challenges that vary systematically in complexity and time pressure. Scenario content is derived from three sources: (a) publicly available wargame scenarios from Naval War College and Army War College publications, which provide the tactical context and force compositions; (b) CSIS wargame reports documenting AI and autonomy scenarios, which provide validated scenario structures; and (c) HRW and ICRC case studies documenting ethical dilemmas in autonomous weapons employment, which provide the ethically complex engagement challenges. Each scenario presents a sequence of 10–15 engagement events requiring target identification, engagement authorization, and post-engagement assessment decisions, with the mix of lawful targets, protected persons, and ambiguous entities varied by scenario complexity level.

Experimental Design and Conditions

Phase 3 employs a 3 × 3 mixed factorial design with one between-subjects and one within-subjects factor. The between-subjects factor is C2 architecture (HITL, HOTL, HOVL), with participants randomly assigned to one of the three architecture conditions. The within-subjects factor is scenario complexity (low, medium, high), with each participant completing all three complexity levels in counterbalanced order to control for order effects. Participants receive standardized training on their assigned C2 architecture, including a practice scenario, before beginning the experimental trials. The simulation environment presents engagement events through a purpose-built interface displaying a tactical map, sensor feeds, AI-generated targeting recommendations with confidence scores, and rules of engagement parameters.
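The complexity-dependent target mix can be sketched as a weighted event generator. The mix proportions below are hypothetical illustrations, not the calibrated scenario values.

```python
import random

# Hypothetical proportions of (lawful target, protected person, ambiguous
# entity) per complexity level; higher complexity shifts mass toward
# protected and ambiguous entities.
MIX = {"low": (0.8, 0.1, 0.1),
       "medium": (0.5, 0.25, 0.25),
       "high": (0.3, 0.3, 0.4)}

def build_scenario(complexity, rng=random.Random(11)):
    """Generate a 10-15 event sequence with the complexity-dependent mix."""
    n_events = rng.randint(10, 15)
    labels = ("lawful", "protected", "ambiguous")
    return [rng.choices(labels, weights=MIX[complexity])[0]
            for _ in range(n_events)]
```

Seeding the generator per scenario would let every participant in a condition face an identical event sequence, isolating architecture effects from scenario sampling noise.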
Participant Assignment and Randomization

Participants (N ≥ 120) are randomly assigned to C2 architecture conditions using stratified random assignment, with stratification by military experience level to ensure balanced representation across conditions. Randomization is implemented using computer-generated random assignment sequences. Manipulation checks are administered after the first experimental trial to verify that participants correctly understand and perceive their authority level. Participants who fail the manipulation check receive additional training and are re-checked before proceeding; persistent failures are documented but retained under an intent-to-treat analysis framework.

Pilot Testing

A pilot study with 15–20 participants (5–7 per condition) will be conducted prior to the main experiment to: (a) assess scenario difficulty calibration, ensuring that the three complexity levels produce the intended variation in decision difficulty without floor or ceiling effects; (b) evaluate measurement instrument reliability using Cronbach's alpha for the self-report scales and inter-rater reliability for the evaluator rubrics; (c) estimate effect sizes for power analysis refinement; and (d) identify and resolve usability issues with the simulation interface. Pilot data will be analyzed descriptively, and the experimental protocol will be revised as needed before the main data collection.

Data Capture and Logging

The simulation environment captures all participant actions and system events in a timestamped log, including: engagement decisions (authorize, abort, defer) with timestamps; system-state queries initiated by the participant; AI recommendation acceptance/override events; rules-of-engagement consultation events; communication events; and post-engagement assessment entries.
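Stratified assignment with Latin-square counterbalancing can be sketched as follows: shuffle within each experience stratum, then cycle through architectures and complexity orders. The participant IDs and experience labels are hypothetical; a real implementation would draw the sequences from the pre-registered randomization plan.

```python
import random

ARCHS = ("HITL", "HOTL", "HOVL")
# Standard 3x3 Latin square for counterbalancing scenario-complexity order.
LATIN_SQUARE = (("low", "medium", "high"),
                ("medium", "high", "low"),
                ("high", "low", "medium"))

def assign(participants, rng=random.Random(3)):
    """Stratify by experience level, then block-randomize to architectures
    and rotate complexity orders. `participants` maps ID -> experience."""
    assignment = {}
    by_stratum = {}
    for pid, exp in participants.items():
        by_stratum.setdefault(exp, []).append(pid)
    for pids in by_stratum.values():
        rng.shuffle(pids)                       # random order within stratum
        for i, pid in enumerate(pids):
            assignment[pid] = (ARCHS[i % 3], LATIN_SQUARE[i % 3])
    return assignment
```

Cycling through the three architectures inside each shuffled stratum guarantees that every experience level is represented equally in every condition, which is the point of the stratification.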
Self-report instruments (trust scale, NASA-TLX, manipulation check, explanation satisfaction) are administered electronically at specified points during and after each experimental block. All data are stored in a structured database with participant identifiers pseudonymized to protect confidentiality.

Table 3.5

Experimental Conditions and Outcome Measures

C2 Architecture (between-subjects)
  Levels: HITL; HOTL; HOVL
  Description: Human-in-the-loop: explicit authorization required. Human-on-the-loop: monitor and override. Human-over-the-loop: set parameters, system acts autonomously
  Measured Outcomes: Decision quality; response latency; accountability chain integrity; ROE adherence

Scenario Complexity (within-subjects)
  Levels: Low; Medium; High
  Description: Low: clear targets, permissive environment. Medium: mixed combatant/civilian, moderate ambiguity. High: ethical dilemmas, high civilian presence, degraded comms
  Measured Outcomes: Decision accuracy; proportionality scores; discrimination compliance; timeline adherence

Dependent Variables
  Description: Decision quality, response latency, accountability chain integrity, ROE adherence, trust calibration, cognitive load, perceived agency
  Measured Outcomes: Continuous and proportion scales; self-report Likert scales; behavioral counts

Covariates
  Description: Military experience level; prior automation experience; propensity to trust (dispositional); technology self-efficacy
  Measured Outcomes: Demographic questionnaire; pre-experiment scales

Note. HITL = Human-in-the-Loop; HOTL = Human-on-the-Loop; HOVL = Human-over-the-Loop; ROE = Rules of Engagement. Minimum N = 120 (40 per architecture condition). Scenario order counterbalanced using a Latin square design.

Phase 4: Tabletop Exercise Validation Procedures

Tabletop Exercise Design

Phase 4 validates the dynamic autonomy management framework developed through Phases 1–3 using structured tabletop exercises (TTXs) following the methodology established by the RAND Corporation and CSIS for defense policy analysis.
The TTX design translates the empirically derived framework into operationally meaningful scenarios that defense professionals can evaluate against their operational experience and doctrinal knowledge. The exercise is structured in three segments: (a) a familiarization briefing presenting the dynamic autonomy framework, its empirical basis, and its proposed transfer-of-control protocols; (b) a scenario-based exercise in which participants apply the framework to a series of escalating operational vignettes; and (c) a structured debrief and evaluation session in which participants assess the framework against validation criteria.

Participant Roles and Recruitment

Tabletop exercise participants (target N = 15–25 per session, 2–3 sessions) are recruited through purposive sampling from open academic conferences (e.g., Association of the United States Army Annual Meeting, Naval War College conferences, CSIS defense forums) and professional defense networks. Participants are assigned to roles reflecting the C2 hierarchy: Operational Commander (sets strategic parameters, authorizes ROE), Tactical Controller (manages engagement authorities within operational parameters), AWS Operator (interfaces directly with autonomous system representations), Legal Advisor (assesses IHL compliance), and Observer/Evaluator (documents the exercise process and outcomes without participating in decisions). Role assignments are based on participants' professional background and experience level to ensure realistic role performance.

Injects and Facilitation

The TTX employs a series of pre-planned injects—scenario events introduced by the facilitation team at predetermined points to drive the exercise narrative and test specific aspects of the dynamic autonomy framework.
Injects include: initial threat detection requiring autonomy level determination, escalating threat scenarios triggering transfer-of-control events, communications degradation forcing autonomy adjustment, ethically ambiguous targeting scenarios testing discrimination protocols, post-engagement assessment events requiring accountability review, and framework failure scenarios testing fallback mechanisms. The facilitator guides discussion using standardized prompts, ensures all framework elements are exercised, and manages the exercise timeline. A designated recorder captures all decisions, discussions, and evaluation comments.

Evaluation Forms and Validation Criteria

Participants complete structured evaluation forms assessing the dynamic autonomy framework against five validation criteria: (a) operational feasibility—whether the framework's transfer-of-control protocols can be executed within realistic operational timelines; (b) doctrinal compatibility—whether the framework aligns with existing joint doctrine and can be incorporated into doctrinal updates; (c) accountability traceability—whether the framework maintains an adequate decision trail across autonomy transitions; (d) meaningful human control preservation—whether the framework provides sufficient human oversight to satisfy legal and ethical requirements; and (e) scalability—whether the framework can accommodate different operational scales and coalition configurations. Each criterion is rated on a 5-point scale (1 = not met, 5 = fully met), with open-ended justification required for ratings below 3.
Table 3.6

Tabletop Exercise Validation Matrix

Operational Feasibility
  Assessment Method: Timed execution of transfer-of-control protocols during vignettes
  Data Source: Facilitator timing records; participant ratings
  Acceptance Threshold: Mean rating ≥ 3.5; protocols executable within scenario timelines
  Remediation if Unmet: Simplify protocol steps; adjust timing parameters

Doctrinal Compatibility
  Assessment Method: Expert assessment against JP 3-0, JP 3-60, and service-specific doctrine
  Data Source: Participant evaluation forms; legal advisor assessment
  Acceptance Threshold: Mean rating ≥ 3.5; no irreconcilable doctrinal conflicts
  Remediation if Unmet: Revise framework to align with doctrinal requirements

Accountability Traceability
  Assessment Method: Decision trail audit across autonomy transitions during exercise
  Data Source: Exercise recorder logs; audit trail completeness check
  Acceptance Threshold: Mean rating ≥ 4.0; ≥ 90% decision trail completeness
  Remediation if Unmet: Enhance logging requirements; add verification steps

Meaningful Human Control
  Assessment Method: Assessment of human override capability and agency preservation
  Data Source: Participant ratings; exercise observations
  Acceptance Threshold: Mean rating ≥ 4.0; no scenarios where human override was prevented
  Remediation if Unmet: Redesign override mechanisms; strengthen fallback protocols

Scalability
  Assessment Method: Assessment of framework applicability across echelons and coalition configurations
  Data Source: Participant ratings; structured debrief discussion
  Acceptance Threshold: Mean rating ≥ 3.0; framework adaptable to 2+ echelons
  Remediation if Unmet: Add echelon-specific parameters; develop coalition modules

Note. JP = Joint Publication. Acceptance thresholds represent minimum standards for framework validation. Criteria not meeting thresholds trigger iterative framework refinement and re-evaluation in subsequent TTX sessions.

Data Analysis Plan

Qualitative Analysis: Phase 1 and Phase 4

Qualitative data from Phase 1 (document corpus) and Phase 4 (tabletop exercise debriefs and open-ended evaluation responses) are analyzed using grounded theory procedures as described in Section 3.8.
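The acceptance logic of the validation matrix reduces to a threshold check over mean ratings per criterion. The sketch below uses hypothetical ratings; the thresholds follow Table 3.6, but the criterion keys and sample scores are illustrative.

```python
from statistics import mean

# Acceptance thresholds on mean participant ratings (per Table 3.6).
THRESHOLDS = {"operational_feasibility": 3.5,
              "doctrinal_compatibility": 3.5,
              "accountability_traceability": 4.0,
              "meaningful_human_control": 4.0,
              "scalability": 3.0}

def evaluate(ratings):
    """Flag criteria whose mean rating falls below threshold, triggering
    remediation and re-evaluation. `ratings` maps criterion -> scores."""
    return {c: ("met" if mean(scores) >= THRESHOLDS[c] else "remediate")
            for c, scores in ratings.items()}

# Hypothetical ratings from one session.
session = {"scalability": [3, 4, 2, 3, 3],
           "meaningful_human_control": [4, 5, 4, 4]}
status = evaluate(session)
```

Criteria flagged "remediate" would be revised per the remediation column and re-rated in a subsequent TTX session, matching the iterative refinement loop the note describes.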
Phase 1 analysis produces the core grounded theory model through open, axial, and selective coding, with theoretical saturation as the termination criterion. Phase 4 qualitative data are analyzed using directed content analysis (Hsieh & Shannon, 2005), with codes derived from the grounded theory model and the validation criteria. The Phase 4 analysis specifically examines whether practitioners' experiential evaluations of the framework align with, extend, or contradict the theoretical propositions generated in Phase 1, providing a form of member-checking at the framework level.

Quantitative Analysis: Phases 2 and 3

Quantitative data from Phases 2 and 3 are analyzed using a hierarchical strategy proceeding from descriptive statistics through inferential testing to multivariate modeling. Descriptive statistics (means, standard deviations, distributions, correlations) are computed for all dependent variables within each condition to characterize the data and check distributional assumptions. Reliability checks include Cronbach's alpha for all multi-item self-report scales and inter-rater reliability (ICC) for evaluator-scored rubrics. The primary inferential analysis employs multivariate analysis of variance (MANOVA) to test the effects of C2 architecture type on the dependent variable vector (decision quality, response latency, accountability chain integrity, operator cognitive load, trust calibration). MANOVA is appropriate because the dependent variables are theoretically related and correlated; testing them simultaneously protects against inflation of the familywise error rate while capturing the multivariate pattern of effects. Significant MANOVA results are followed by univariate ANOVAs for each dependent variable and post hoc pairwise comparisons using Tukey's HSD.
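The univariate follow-up ANOVA can be illustrated with a pure-Python computation of the F statistic and eta-squared; real analyses would use a statistics package (e.g., statsmodels or R), and the latency samples below are hypothetical, chosen only to echo the direction of the expected architecture effect.

```python
from statistics import mean

def oneway_f(groups):
    """One-way ANOVA F statistic and eta-squared for k independent groups."""
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_b = len(groups) - 1                       # k - 1
    df_w = sum(len(g) for g in groups) - len(groups)  # N - k
    f = (ss_between / df_b) / (ss_within / df_w)
    return f, ss_between / (ss_between + ss_within)

# Hypothetical response-latency samples (seconds) for HITL, HOTL, HOVL.
f, eta_sq = oneway_f([[8.2, 8.6, 8.5], [3.1, 3.4, 3.0], [1.1, 1.3, 1.2]])
```

With well-separated group means and small within-group spread, F is very large and eta-squared approaches 1, which is the pattern a strong architecture effect on latency would produce.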
The interaction between C2 architecture and scenario complexity is tested using a mixed-model ANOVA with architecture as the between-subjects factor and complexity as the within-subjects factor. Where the data support more complex modeling, structural equation modeling (SEM) or path analysis will be employed to test the theorized causal relationships among autonomy level, trust calibration, cognitive load, and decision quality. SEM provides the capability to simultaneously model measurement error and test the structural relationships among latent constructs, offering a more rigorous test of the theoretical framework than ANOVA alone. Model fit will be assessed using standard fit indices: χ²/df ratio, Comparative Fit Index (CFI ≥ .95), Root Mean Square Error of Approximation (RMSEA ≤ .06), and Standardized Root Mean Square Residual (SRMR ≤ .08), following Hu and Bentler's (1999) recommendations. For Phase 2 ABM data, analysis includes sensitivity analysis using Sobol indices to identify the model parameters with the greatest influence on dependent variable outputs, Monte Carlo estimation of the dependent variable distributions under each condition, and analysis of variance across the 27-condition scenario matrix. ABM results are visualized through response surface plots showing the interaction of operational tempo, scenario complexity, and C2 architecture on each dependent variable.

Mixed-Methods Integration

The integration of qualitative and quantitative findings follows the convergent mixed-methods integration strategy described by Creswell and Plano Clark (2018). Integration occurs through three mechanisms.
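Two of the fit indices can be computed directly from the model and baseline chi-square values using their standard formulas; the sketch below applies hypothetical chi-square values for illustration (SRMR requires the residual correlation matrix and is omitted).

```python
from math import sqrt

def fit_indices(chi2, df, chi2_base, df_base, n):
    """CFI and RMSEA from model and independence (baseline) chi-squares.
    Cutoffs per Hu & Bentler (1999): CFI >= .95, RMSEA <= .06."""
    d_model = max(chi2 - df, 0.0)                 # model noncentrality estimate
    d_base = max(chi2_base - df_base, d_model) or 1e-12  # guard zero division
    cfi = 1.0 - d_model / d_base
    rmsea = sqrt(d_model / (df * (n - 1)))
    return cfi, rmsea

# Hypothetical values: model chi2 = 95 on df = 60, baseline chi2 = 900 on
# df = 78, sample size n = 120.
cfi, rmsea = fit_indices(95, 60, 900, 78, 120)
```

Note that both indices clamp negative noncentrality to zero, so a model whose chi-square is below its degrees of freedom reports CFI = 1.0 and RMSEA = 0.0.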
First, a joint display matrix presents qualitative findings (Phase 1 grounded theory categories and propositions; Phase 4 evaluation themes) alongside quantitative findings (Phase 2 simulation results; Phase 3 experimental results) for each research question, enabling systematic comparison and triangulation. Second, meta-inferences are drawn by assessing the degree of convergence, complementarity, or divergence between qualitative and quantitative findings for each construct. Third, the final dynamic autonomy management framework is constructed by integrating evidence across all four phases, with qualitative evidence providing explanatory depth and contextual richness, and quantitative evidence providing precision and generalizability.

Reliability, Validity, Trustworthiness, and Rigor

Qualitative Trustworthiness

The trustworthiness of the qualitative components (Phases 1 and 4) is established through the four criteria of Lincoln and Guba (1985). Credibility is supported by triangulation across four document source categories, prolonged engagement with the data, peer debriefing, and member-checking through the Phase 4 tabletop exercises, in which defense practitioners evaluate the grounded theory model's fidelity to their operational experience. Dependability is established through a comprehensive audit trail documenting all methodological decisions, codebook versions, and analytic memos, and through inter-coder reliability assessment with a target kappa of .80 or above. Confirmability is addressed through reflexive memoing, negative case analysis (actively seeking data that contradict emerging theoretical claims), and the decision log. Transferability is supported by detailed description of the research context, participant characteristics, and analytic procedures.
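The inter-coder reliability target (κ ≥ .80) can be checked with a direct implementation of Cohen's kappa over two coders' parallel code assignments; the category labels below are illustrative stand-ins for the actual codebook categories.

```python
def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: chance-corrected agreement between two coders.
    Inputs are equal-length lists of code labels for the same segments."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    labels = set(coder_a) | set(coder_b)
    p_obs = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's marginal label frequencies.
    p_exp = sum((coder_a.count(l) / n) * (coder_b.count(l) / n) for l in labels)
    return (p_obs - p_exp) / (1 - p_exp)

# Illustrative labels, not the study codebook.
a = ["gov", "gov", "trust", "acct", "gov", "trust"]
b = ["gov", "trust", "trust", "acct", "gov", "trust"]
kappa = cohens_kappa(a, b)
```

With 5/6 raw agreement but a chance-agreement rate of 13/36, kappa lands near .74, below the .80 target, which would trigger codebook refinement and re-coding before proceeding.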
Quantitative Validity

Internal validity in Phase 3 is supported by random assignment to C2 architecture conditions, counterbalancing of scenario complexity order, standardized experimental protocols, manipulation checks, and control for identified covariates (military experience, prior automation experience, dispositional trust). Threats to internal validity—including maturation, testing effects, and demand characteristics—are mitigated through the counterbalanced design, cover stories that obscure the specific hypotheses being tested, and post-experiment suspicion checks. External validity is supported by the use of realistic scenarios derived from publicly available military wargame materials rather than abstract laboratory tasks, the inclusion of participants with varying levels of military experience, and the validation of experimental findings through the Phase 4 tabletop exercises with operational defense professionals. However, the controlled experimental setting represents a limitation on ecological validity that is explicitly acknowledged in Section 3.15. Construct validity is established through: (a) the use of validated, psychometrically established measurement instruments (Jian et al., 2000; Hart & Staveland, 1988; Schaefer et al., 2016) with documented convergent and discriminant validity; (b) the multi-modal measurement approach that triangulates self-report, behavioral, and system-logged indicators for key constructs; and (c) pilot testing to confirm that the measurement instruments perform as expected in the autonomous weapons decision-making context. Statistical conclusion validity is supported by adequate sample size (determined through a priori power analysis), appropriate statistical tests for the data structure, and reporting of effect sizes and confidence intervals alongside significance tests.
Model Validity for Agent-Based Modeling

The validity of the Phase 2 ABM is established through three complementary approaches. Face validity is assessed through expert review of the model structure, behavioral rules, and parameter values, ensuring that the model is judged by knowledgeable practitioners to be a plausible approximation of real-world C2 dynamics. Structural validity is assessed through sensitivity analysis, confirming that the model's behavior responds to parameter changes in theoretically expected directions and that results are robust to reasonable parameter perturbations. Pattern-oriented validation (Grimm et al., 2005) compares emergent model patterns against documented empirical patterns from the Phase 1 qualitative data, assessing whether the model reproduces the dynamic behaviors described in the policy and practitioner literature. Cross-validation with Phase 3 experimental results provides the strongest test of model validity, assessing whether ABM predictions are confirmed by human experimental data.

Scenario Validity for Simulations and Tabletop Exercises

The validity of experimental and tabletop scenarios is established through: (a) derivation from authoritative sources (war college wargames, CSIS reports, HRW/ICRC case studies) rather than researcher-generated scenarios; (b) review by subject matter experts during pilot testing; (c) manipulation checks confirming that participants perceive scenario complexity and autonomy levels as intended; and (d) iterative refinement based on pilot data and expert feedback. Tabletop exercise scenarios are additionally validated by assessing their alignment with Joint Publication planning factors and operational doctrine.
Ethical Considerations

This research raises several ethical considerations that are addressed through rigorous protocols designed to ensure responsible conduct while enabling the generation of knowledge directly relevant to national security decision-making.

Use of Public and Unclassified Data

Classification concerns are mitigated by the exclusive use of publicly available, unclassified data sources throughout all phases of the research. No classified information, systems, or scenarios are used at any stage. All Department of Defense documents referenced are publicly released through official channels (Congress.gov, GAO.gov, defense.gov). Simulation scenarios are constructed from open-source materials. This design choice ensures broad dissemination of findings while avoiding security complications that could restrict the research's policy impact.

Data Security and Confidentiality

All experimental data are pseudonymized using randomly generated participant identifiers, with the linking key stored separately from the research data on an encrypted, access-controlled device. Simulation data are stored on encrypted systems compliant with institutional data security requirements. Participant demographics are reported only in aggregate to prevent identification of specific military units or installations. Data retention follows institutional and federal requirements, with identifiable data destroyed after the mandated retention period.

Dual-Use Concerns and Responsible Research

The ethical sensitivity of autonomous weapons research demands reflexive engagement with the normative implications of the research design. The researcher acknowledges that dynamic autonomy management frameworks can either strengthen or weaken human oversight depending on implementation.
The research explicitly incorporates meaningful human control as a non-negotiable constraint rather than a variable to be optimized away, consistent with the ethical principles established by the Defense Innovation Board (2019) and the ICRC (2021). The framework is designed to enhance, not diminish, human oversight of autonomous weapons employment. Research outputs will be reviewed for potential dual-use implications prior to publication, and the researcher will engage with appropriate institutional review processes for research with national security implications.

Limitations and Delimitations of the Methodology

Methodological Limitations

Several methodological limitations constrain the interpretation and generalizability of this research. First, the exclusive reliance on publicly available, unclassified data means that the research cannot incorporate classified operational data, after-action reports, or system performance specifications that would enhance the fidelity of the computational model and experimental scenarios. This limitation is an inherent consequence of the decision to prioritize broad dissemination and avoid classification complications, but it means that the resulting framework will require further validation with classified data before operational adoption. Second, the Phase 3 simulation environment, while designed for maximum realism using publicly available military wargame materials, cannot fully replicate the stress, consequences, and organizational dynamics of actual combat operations. The ecological validity of laboratory-based weapons employment research is inherently limited, as noted in the Pokorny (2026) systematic review. Participants in the simulation experiments are aware that their decisions have no real consequences, which may affect their risk calculation, trust calibration, and decision-making strategies in ways that differ from operational behavior.
Third, the sample for Phase 3 comprises defense-adjacent professionals and students from professional military education institutions rather than active-duty personnel making real weapons employment decisions. While this sample provides relevant domain expertise, the lack of active operational context may limit the transferability of findings to front-line C2 environments. Fourth, the agent-based model, while parameterized from empirical data, necessarily simplifies the complexity of real-world C2 dynamics and may not capture all emergent behaviors of operational human-AI teams.

Delimitations

Several boundaries are intentionally imposed on the study. The research focuses exclusively on the dynamic allocation of decision authority in the engagement decision cycle (surveillance through post-engagement assessment) and does not address broader C2 functions such as logistics, intelligence analysis, or force management, which involve different decision dynamics and trust requirements. The study examines three specific C2 architectures (HITL, HOTL, HOVL) as they are defined in the current literature and does not investigate hybrid or novel architectures beyond these established paradigms. The temporal scope of the document corpus (2012–2026) captures the period since the original DoD Directive 3000.09 but does not extend to earlier autonomous weapons discourse. Finally, the study addresses U.S. and allied perspectives on autonomous weapons governance; while the SIPRI data provide global context, the framework is designed primarily for application within U.S. military C2 structures and may require adaptation for other national contexts.

Chapter Summary

This chapter has presented the comprehensive methodology for investigating dynamic autonomy management in human-AI command and control for autonomous weapons systems.
The four-phase sequential mixed-methods design integrates qualitative grounded theory development, agent-based computational modeling, simulation-based experimentation, and tabletop exercise validation into a coherent research program that moves from exploratory discovery through computational testing and experimental verification to operational validation. Each phase builds upon its predecessor, with the qualitative findings parameterizing the computational model, the model predictions generating experimental hypotheses, and the experimental results informing the validation exercises. The methodology addresses the critical gaps identified in the Pokorny (2026) systematic literature review and the comprehensive literature review presented in Chapter 2, including the absence of empirically validated dynamic autonomy frameworks (C2-3), the lack of comparative C2 architecture assessments (C2-5), the need for validated transfer-of-control protocols (AWS-3), and the call for multi-modal measurement moving beyond self-report metrics alone (TT-6).

All data sources are publicly available and unclassified, ensuring broad dissemination of findings while maintaining the scholarly rigor necessary for a doctoral dissertation in military science and technology. The research is designed to produce findings directly actionable for the Joint Chiefs of Staff and senior leaders of the joint military-industrial base. The grounded theory model will provide the conceptual foundation for doctrinal updates, the computational model will enable scenario-based policy analysis, the experimental results will establish empirical benchmarks for C2 architecture performance, and the validated framework will offer a ready-to-adopt governance tool for autonomous weapons employment. Chapter 4 presents the results of Phase 1, the qualitative grounded theory analysis of the compiled document corpus.
CHAPTER 4: RESULTS

Introduction

This chapter presents the results of the four-phase sequential mixed-methods investigation into dynamic autonomy management in human-AI command and control for autonomous weapons systems. The findings are organized sequentially by research phase, beginning with the qualitative grounded theory analysis of the policy and doctrinal document corpus (Phase 1), followed by the agent-based modeling simulation results (Phase 2), the simulation-based experimental findings (Phase 3), and the expert tabletop exercise validation results (Phase 4). The chapter concludes with a cross-phase integration section that synthesizes convergent and divergent findings across all four methodological approaches, mapping the cumulative evidence to each of the three research questions guiding this dissertation.

Three research questions guided this investigation. RQ1 asked: How should decision authority be dynamically allocated between human commanders and autonomous weapons AI across different operational phases (surveillance, identification, tracking, engagement, and post-engagement assessment)? RQ2 asked: What transfer-of-control protocols preserve meaningful human agency without degrading operational tempo below mission-critical thresholds? RQ3 asked: How do different C2 architectures (human-in-the-loop, human-on-the-loop, human-over-the-loop) affect both operational effectiveness and accountability traceability in autonomous weapons employment?

The results presented herein reveal a consistent pattern across all four phases: the fundamental tension between operational tempo and accountability integrity constitutes the central design constraint for dynamic autonomy management frameworks.
Phase 1 qualitative analysis identified Autonomy Governance as the core category with the highest centrality score among eight emergent categories, with AI/ML Capabilities appearing as the most frequently coded theme across the 84-document corpus (48.8% of documents). Phase 2 agent-based modeling quantified the speed-accountability tradeoff, revealing that human-in-the-loop (HITL) architectures maintained 97.8% accountability chain integrity but with mean response latencies of 8.51 seconds, while human-over-the-loop (HOVL) architectures achieved 1.20-second response times at the cost of reduced accountability integrity (68.2%). Phase 3 experimental analysis confirmed these patterns with human factors data, demonstrating large effects of autonomy level on response time (η²p = .73) and significant interactions between autonomy level and threat tempo on cognitive load (η²p = .16). Phase 4 expert validation rated the resulting Dynamic Autonomy Management framework significantly above neutral across all five evaluation criteria, with Decision Traceability receiving the highest mean rating (M = 5.83, SD = 0.62) and Scalability rated lowest (M = 4.72, SD = 1.18).

Each phase section below presents descriptive results, inferential analyses, relevant tables and figures, and interpretive summaries. All statistical results are reported in accordance with the American Psychological Association 7th edition reporting standards. Effect sizes are reported alongside significance tests to facilitate practical interpretation of findings. Tables present actual computed values from the analysis pipeline, and figures reproduce the visualizations generated during the analytical process.

The chapter is organized to mirror the sequential logic of the research design. Phase 1 results establish the qualitative foundation by identifying the thematic landscape and emergent theoretical framework from the document corpus.
Phase 2 results translate these qualitative insights into computational models, quantifying the performance tradeoffs inherent in different C2 architectures. Phase 3 results introduce the human factors dimension, examining how autonomy level and threat tempo interact to affect decision accuracy, response time, trust, cognitive load, and rules of engagement compliance. Phase 4 results provide the capstone validation, presenting expert evaluations of the integrated Dynamic Autonomy Management framework. Finally, the cross-phase integration section demonstrates how findings converge across methodological boundaries, strengthening the evidentiary base through triangulation. Each section identifies key findings that are carried forward to the discussion in Chapter 5.

Phase 1: Qualitative Grounded Theory Results

Document Corpus Description

The Phase 1 qualitative analysis employed a constructivist grounded theory approach to analyze a purposively sampled corpus of 84 policy, doctrinal, legal, and analytical documents related to autonomous weapons systems, human-AI command and control, and meaningful human control. The corpus was assembled from five primary source categories spanning the period from 2012 to 2026, representing the most active period of institutional discourse on autonomous weapons governance.

The document corpus comprised five source categories: Congressional testimony (n = 33, 39.3%), Government Accountability Office (GAO) reports (n = 11, 13.1%), Congressional Research Service (CRS) reports (n = 8, 9.5%), think tank publications from organizations including CNAS, RAND, and Brookings (n = 15, 17.9%), and Human Rights Watch/International Committee of the Red Cross cases and position papers (n = 17, 20.2%). This distribution ensured representation across governmental, military, analytical, and advocacy perspectives on autonomous weapons governance.
The coding process yielded a total of 19 unique thematic codes organized into eight major categories. Across the 84 documents, a mean of 2.57 codes per document was applied (Mdn = 2.0), with a maximum of 7 codes assigned to a single document. Four documents received no codes, as their content fell outside the scope of the analytical framework. The coding achieved theoretical saturation after approximately 65 documents, with subsequent documents reinforcing existing categories without generating new codes.

Open Coding Results

The open coding phase of the grounded theory analysis produced 19 thematic codes through line-by-line and incident-by-incident coding of the document corpus. Table 4.1 presents the complete theme frequency distribution, including absolute frequencies, percentages of the total corpus, and breakdowns by document source category. The theme frequencies exhibited a characteristic long-tail distribution, with a small number of highly prevalent themes and a larger number of less frequently occurring codes.

AI/ML Capabilities emerged as the most frequently coded theme, appearing in 41 of 84 documents (48.8%). This theme captured discussions of specific AI and machine learning technologies, capabilities, and limitations as they pertain to autonomous weapons systems. The prevalence of this theme reflects the dominant focus of institutional discourse on the technological dimensions of autonomy. Notably, this theme was most prevalent in CRS reports (75.0% of CRS documents) and GAO reports (63.6%), indicating that governmental analytical bodies devoted substantial attention to technical capabilities assessment. Oversight Mechanisms was the second most frequent theme (n = 32, 38.1%), appearing most consistently in GAO reports (100.0% of GAO documents), reflecting that agency's institutional mandate to evaluate government oversight processes.
Accountability Chain (n = 19, 22.6%) and Moral/Ethical Concerns (n = 19, 22.6%) were tied as the third most frequent codes. The ethical concerns theme showed a distinctive source pattern, appearing in 58.8% of HRW/ICRC documents but only 12.1% of Congressional testimony, suggesting a divergence between advocacy and legislative perspectives on the ethical dimensions of autonomous weapons.

Table 4.1

Theme Frequency Distribution Across Document Corpus (N = 84)

Theme                           Category                    n    %    Cong.  GAO  CRS  Think  HRW/
                                                                      Test.             Tank  ICRC
AI/ML Capabilities              Technology                 41  48.8    17     7    6     8     3
Oversight Mechanisms            Autonomy Governance        32  38.1     8    11    4     7     2
Accountability Chain            Accountability             19  22.6     6     2    2     5     4
Moral/Ethical Concerns          Ethics                     19  22.6     4     0    3     2    10
Legal Accountability Framework  Accountability             17  20.2     4     0    3     1     9
Meaningful Human Control        Meaningful Human Control   16  19.0     3     0    1     4     8
DoDD 3000.09 Governance         Autonomy Governance        15  17.9     5     2    3     2     3
Arms Race/Proliferation         Ethics                     13  15.5     5     1    2     4     1
Technical Risk/Failure          Technology                 11  13.1     6     1    1     0     3
Rules of Engagement             Autonomy Governance        10  11.9     1     0    3     0     6
Transfer-of-Control Triggers    Transfer of Control         6   7.1     1     1    2     1     1
Trust Calibration               Trust                       5   6.0     1     1    0     2     1
Explainability/Transparency     Trust                       3   3.6     0     0    0     1     2
Transfer Conditions/Criteria    Transfer of Control         2   2.4     0     0    0     2     0
Accountability Gap              Accountability              2   2.4     0     0    0     0     2
Decision Authority Allocation   Decision Authority          2   2.4     0     0    1     0     1
MHC Definition/Standards        Meaningful Human Control    2   2.4     1     0    0     1     0
Dynamic Autonomy                Decision Authority          1   1.2     0     0    0     1     0
MHC Erosion Risks               Meaningful Human Control    0   0.0     0     0    0     0     0

Note. Source columns report the number of documents within each source category in which the theme appeared.

The Legal Accountability Framework theme (n = 17, 20.2%) showed particularly strong representation in HRW/ICRC documents (52.9%), reflecting these organizations' focus on international humanitarian law compliance.
Meaningful Human Control (n = 16, 19.0%) similarly showed disproportionate representation in advocacy documents (47.1% of HRW/ICRC cases), aligning with these organizations' advocacy for binding regulations on autonomous weapons. DoDD 3000.09 Governance (n = 15, 17.9%) was distributed more evenly across sources, appearing in all five source categories, reflecting the foundational role of this Department of Defense directive in autonomous weapons policy discourse.

At the lower end of the frequency distribution, several themes appeared infrequently but carried theoretical significance. Transfer-of-Control Triggers (n = 6, 7.1%) and Transfer Conditions/Criteria (n = 2, 2.4%) were relatively rare in the existing document corpus, suggesting that operational mechanisms for autonomy transitions receive comparatively less attention than governance principles in current institutional discourse. Similarly, Dynamic Autonomy appeared in only one document (1.2%), indicating that the concept of context-dependent autonomy adjustment—central to this dissertation's research questions—represents a genuine gap in the existing literature.

Figure 4.1

Theme Frequency Distribution Across the 84-Document Corpus, Color-Coded by Category

Figure 4.1 presents the horizontal bar chart of theme frequencies, color-coded by major category. The visualization reveals the dominance of the Technology and Autonomy Governance categories in the corpus, with Ethics and Accountability themes forming a secondary cluster. The relatively low frequencies of Transfer of Control, Trust, and Decision Authority themes underscore the gap this dissertation addresses: while the policy community has engaged extensively with what autonomous weapons should and should not do, comparatively less attention has been devoted to the mechanisms by which autonomy levels should be dynamically managed.
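The document-level frequencies and corpus percentages reported above follow from a simple tabulation over the document-to-code assignments. A minimal sketch of that step, using invented placeholder documents rather than the actual 84-document corpus:

```python
from collections import Counter

# Hypothetical stand-in for the coded corpus: each document maps to the set
# of thematic codes assigned to it (an empty set = an uncoded document).
corpus = {
    "doc_001": {"AI/ML Capabilities", "Oversight Mechanisms"},
    "doc_002": {"AI/ML Capabilities", "Moral/Ethical Concerns"},
    "doc_003": set(),
}

def theme_frequencies(corpus):
    """Count how many documents each code appears in, with corpus percentages."""
    counts = Counter()
    for codes in corpus.values():
        counts.update(codes)
    n_docs = len(corpus)
    return {code: (n, round(100 * n / n_docs, 1))
            for code, n in counts.most_common()}

freqs = theme_frequencies(corpus)
print(freqs["AI/ML Capabilities"])  # (2, 66.7) for this toy corpus
```

Applied to the full corpus, the same tabulation yields the n and % columns of Table 4.1; the per-source breakdowns repeat it within each source category.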
Figure 4.2

Theme Distribution by Document Source Category

Figure 4.2 illustrates the distribution of themes across the five document source categories. The stacked bar chart reveals notable source-specific emphases: Congressional testimony and GAO reports focused primarily on oversight mechanisms and technical capabilities, while HRW/ICRC documents disproportionately addressed ethical concerns, legal accountability, and meaningful human control. Think tank publications showed the most balanced distribution across themes, consistent with their role as integrative analytical intermediaries between government, military, and civil society perspectives.

The source-specific patterns warrant further discussion. Congressional testimony, the largest source category (n = 33), was dominated by AI/ML Capabilities (51.5%) and Oversight Mechanisms (24.2%), reflecting the legislative branch's dual focus on understanding technological capabilities and ensuring adequate governmental oversight. The near-absence of Legal Accountability Framework and Meaningful Human Control themes in Congressional testimony (12.1% and 9.1%, respectively) suggests that legislative discourse has not yet fully engaged with the normative frameworks that dominate academic and advocacy discussions. GAO reports (n = 11) showed the most concentrated thematic focus, with 100% of documents coding for Oversight Mechanisms and 63.6% for AI/ML Capabilities. The remaining GAO themes appeared at much lower frequencies, reflecting the GAO's institutional mandate to evaluate oversight processes and technical capabilities rather than to engage with ethical or legal questions.
CRS reports (n = 8) exhibited the broadest thematic distribution relative to their sample size, with substantial representation across governance (DoDD 3000.09: 37.5%, Rules of Engagement: 37.5%), accountability (Legal Accountability Framework: 37.5%), and ethics (Moral/Ethical Concerns: 37.5%) categories. This breadth reflects the CRS's role as an impartial analytical service providing comprehensive background on policy issues for Congressional members.

HRW/ICRC documents (n = 17) showed the most distinctive thematic profile. Moral/Ethical Concerns dominated (58.8%), followed by Legal Accountability Framework (52.9%) and Meaningful Human Control (47.1%). In contrast, AI/ML Capabilities appeared in only 17.6% of advocacy documents, a striking departure from other source categories. This pattern reveals a fundamental discourse asymmetry: governmental and technical sources frame autonomous weapons primarily as technological challenges requiring oversight, while advocacy organizations frame them primarily as ethical and legal challenges requiring normative constraint. This asymmetry has significant implications for policy formation, as it suggests that different stakeholder communities may be talking past each other when discussing the same systems.

Axial Coding and Theme Relationships

The axial coding phase examined relationships between the 19 thematic codes, identifying co-occurrence patterns, cross-category linkages, and hierarchical relationships. The 19×19 co-occurrence matrix revealed 87 unique code pairs that co-occurred in at least one document, with co-occurrence counts ranging from 1 to 19. Table 4.2 presents the top 15 axial coding relationships ranked by co-occurrence frequency, along with Jaccard similarity coefficients and pointwise mutual information (PMI) scores.

Table 4.2

Top 15 Axial Coding Relationships by Co-occurrence Frequency

Code Pair                                                  Co-occur.  Jaccard    PMI   Type
Oversight Mechanisms ↔ AI/ML Capabilities                     19       0.352    0.283  cross-category
Legal Accountability Framework ↔ Moral/Ethical Concerns       13       0.565    1.757  cross-category
Accountability Chain ↔ AI/ML Capabilities                     10       0.200    0.109  cross-category
Meaningful Human Control ↔ Moral/Ethical Concerns              9       0.346    1.314  cross-category
Accountability Chain ↔ Oversight Mechanisms                    8       0.186    0.144  cross-category
AI/ML Capabilities ↔ Arms Race/Proliferation                   8       0.174    0.334  cross-category
DoDD 3000.09 Governance ↔ Oversight Mechanisms                 8       0.205    0.485  within-category
Oversight Mechanisms ↔ Technical Risk/Failure                  7       0.194    0.740  cross-category
AI/ML Capabilities ↔ Moral/Ethical Concerns                    7       0.132   -0.406  cross-category
AI/ML Capabilities ↔ Technical Risk/Failure                    7       0.156    0.383  within-category
Rules of Engagement ↔ Moral/Ethical Concerns                   7       0.318    1.630  cross-category
Legal Accountability Framework ↔ Rules of Engagement           6       0.286    1.568  cross-category
Legal Accountability Framework ↔ DoDD 3000.09 Governance       5       0.185    0.720  cross-category
DoDD 3000.09 Governance ↔ Meaningful Human Control             5       0.192    0.807  cross-category
DoDD 3000.09 Governance ↔ AI/ML Capabilities                   5       0.098   -0.550  cross-category

The strongest co-occurrence relationship was between Oversight Mechanisms and AI/ML Capabilities (co-occurrence = 19, Jaccard = .352), indicating that discussions of AI capabilities were frequently coupled with governance and oversight considerations. This cross-category linkage between Technology and Autonomy Governance themes suggests that institutional discourse treats technical capabilities and governance mechanisms as inherently intertwined rather than separate domains. The second strongest relationship, between Legal Accountability Framework and Moral/Ethical Concerns (co-occurrence = 13, Jaccard = .565, PMI = 1.757), exhibited the highest Jaccard similarity coefficient among all pairs.
The elevated PMI score indicates that these themes co-occurred substantially more often than would be expected by chance, reflecting the tight conceptual coupling between legal accountability structures and ethical considerations in autonomous weapons discourse. This finding is consistent with the advocacy literature's framing of legal accountability as the operational expression of ethical obligations under international humanitarian law.

Accountability Chain co-occurred with AI/ML Capabilities in 10 documents (Jaccard = .200), suggesting that accountability concerns arise directly in relation to specific technical capabilities. The co-occurrence of Meaningful Human Control with Moral/Ethical Concerns (co-occurrence = 9, Jaccard = .346, PMI = 1.314) further reinforces the conceptual linkage between the MHC framework and ethical imperatives. Within the Autonomy Governance category, DoDD 3000.09 Governance and Oversight Mechanisms co-occurred frequently (co-occurrence = 8, Jaccard = .205), reflecting the directive's foundational role in shaping oversight structures.

Figure 4.3

Code Co-occurrence Heatmap Across the 19 Thematic Codes

Figure 4.3 presents the full co-occurrence heatmap, visualizing the 19×19 matrix of code co-occurrence counts. The heatmap reveals several notable structural patterns. The diagonal values represent within-code document counts (i.e., total frequency for each code). The densest off-diagonal region involves the Technology-Governance-Accountability triangle, with AI/ML Capabilities, Oversight Mechanisms, and Accountability Chain forming a highly interconnected cluster. The Ethics codes (Moral/Ethical Concerns, Arms Race/Proliferation) show moderate connections to this central cluster, while the Trust, Decision Authority, and Transfer of Control codes appear more peripheral, with sparser co-occurrence patterns.
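The Jaccard and PMI statistics in Table 4.2 can be recomputed directly from the marginal code frequencies and co-occurrence counts; a base-2 logarithm over document proportions (N = 84) reproduces the reported values. A brief sketch of both measures:

```python
import math

N_DOCS = 84  # corpus size

def jaccard(n_a, n_b, n_ab):
    """Jaccard similarity over document sets: |A intersect B| / |A union B|."""
    return n_ab / (n_a + n_b - n_ab)

def pmi(n_a, n_b, n_ab, n_docs=N_DOCS):
    """Base-2 pointwise mutual information: observed vs. expected co-occurrence."""
    p_a, p_b, p_ab = n_a / n_docs, n_b / n_docs, n_ab / n_docs
    return math.log2(p_ab / (p_a * p_b))

# Oversight Mechanisms (n = 32) with AI/ML Capabilities (n = 41),
# co-occurring in 19 documents:
print(round(jaccard(32, 41, 19), 3))  # 0.352
print(round(pmi(32, 41, 19), 3))      # 0.283
```

Positive PMI values mark pairs that co-occur more often than independence would predict, while negative values (e.g., AI/ML Capabilities with Moral/Ethical Concerns, PMI = -0.406) mark pairs that co-occur less often than chance.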
Figure 4.4

Heatmap of Top Cross-Category Axial Coding Relationships by Jaccard Similarity

Selective Coding: Core Categories

The selective coding phase identified the core category and its relational structure with other major categories through centrality analysis. Table 4.3 presents the centrality scores for all eight major categories, computed as a composite of within-category code frequency, cross-category co-occurrence density, and structural position in the axial coding network.

Table 4.3

Category Centrality Scores from Selective Coding Analysis

Category                  Centrality Score  Total Code Frequency  Number of Subthemes
Autonomy Governance            148.0                 57                   3
Technology                     106.0                 52                   2
Accountability                 100.0                 38                   3
Ethics                          81.0                 32                   2
Meaningful Human Control        50.0                 18                   3
Transfer of Control             26.0                  8                   2
Trust                           23.0                  8                   2
Decision Authority              12.0                  3                   2

Autonomy Governance emerged as the core category with the highest centrality score (148.0), comprising three sub-themes: Oversight Mechanisms (n = 32), DoDD 3000.09 Governance (n = 15), and Rules of Engagement (n = 10). The dominant centrality of Autonomy Governance reflects its position as the conceptual hub connecting technical capabilities to ethical requirements and accountability structures. Technology (centrality = 106.0) and Accountability (centrality = 100.0) formed the secondary tier, with Ethics (81.0) completing the upper-level category cluster.

The hierarchical structure of the eight categories and their 19 constituent themes is presented in Table 4.4. This taxonomy represents the emergent theoretical structure generated through the three-phase coding process. The category with the most balanced internal distribution was Accountability, where Accountability Chain (50.0%) and Legal Accountability Framework (44.7%) each contributed substantially.
In contrast, the Technology category was dominated by a single theme: AI/ML Capabilities accounted for 78.8% of the category's total frequency, suggesting that institutional discourse treats AI capabilities as the defining technological concern.

Table 4.4

Hierarchical Theme Taxonomy: Categories and Constituent Themes

Category                  Category Total  Theme                           Theme Frequency  % of Category
Transfer of Control              8        Transfer-of-Control Triggers           6             75.0
Transfer of Control              8        Transfer Conditions/Criteria           2             25.0
Accountability                  38        Accountability Chain                  19             50.0
Accountability                  38        Legal Accountability Framework        17             44.7
Accountability                  38        Accountability Gap                     2              5.3
Autonomy Governance             57        Oversight Mechanisms                  32             56.1
Autonomy Governance             57        DoDD 3000.09 Governance               15             26.3
Autonomy Governance             57        Rules of Engagement                   10             17.5
Meaningful Human Control        18        Meaningful Human Control              16             88.9
Meaningful Human Control        18        MHC Definition/Standards               2             11.1
Meaningful Human Control        18        MHC Erosion Risks                      0              0.0
Trust                            8        Trust Calibration                      5             62.5
Trust                            8        Explainability/Transparency            3             37.5
Decision Authority               3        Decision Authority Allocation          2             66.7
Decision Authority               3        Dynamic Autonomy                       1             33.3
Technology                      52        AI/ML Capabilities                    41             78.8
Technology                      52        Technical Risk/Failure                11             21.2
Ethics                          32        Moral/Ethical Concerns                19             59.4
Ethics                          32        Arms Race/Proliferation               13             40.6

Figure 4.5

Hierarchical Theme Taxonomy Visualization Showing Category Structure and Theme Frequencies

Emergent Theoretical Framework

The grounded theory analysis yielded an emergent theoretical framework organized around the core category of Autonomy Governance, with three primary relational dimensions: transfer-of-control triggers, accountability mechanisms, and governance constraints. This framework represents the theoretical foundation upon which the subsequent quantitative phases were built.
The first dimension, transfer-of-control triggers, emerged from the sparse but theoretically significant codes related to Transfer of Control (combined frequency = 8). Documents that addressed transfer mechanisms identified three categories of triggers: (a) threat-driven triggers, where changes in operational tempo or threat level necessitate autonomy adjustment; (b) performance-driven triggers, where AI system performance metrics cross predefined thresholds; and (c) governance-driven triggers, where changes in rules of engagement, legal constraints, or command authority necessitate autonomy level modification. These trigger categories directly informed the ABM's architecture-dependent decision points in Phase 2.

The second dimension, accountability mechanisms, was grounded in the strong co-occurrence cluster linking Accountability Chain, Legal Accountability Framework, and Accountability Gap codes. The qualitative evidence revealed a persistent tension: as autonomy increases, the accountability chain becomes more diffuse and difficult to trace. Documents from HRW/ICRC sources explicitly raised the accountability gap concern, noting that existing legal frameworks may be inadequate for assigning responsibility when autonomous systems make consequential decisions without direct human authorization. This finding directly shaped the ABM's accountability integrity metric and the experimental design's inclusion of ROE compliance as a dependent variable.

The third dimension, governance constraints, reflected the dominant role of DoDD 3000.09 and related policy frameworks in constraining autonomous weapons employment. The qualitative analysis found that governance constraints operate at multiple levels: strategic (policy directives), operational (rules of engagement), and tactical (mission-specific authorization parameters).
The multi-level governance structure informed the ABM's implementation of binding parameters that constrain system behavior regardless of autonomy level.

Key Qualitative Findings

Six key findings emerged from the Phase 1 qualitative analysis. First, Autonomy Governance constitutes the conceptual core of institutional discourse on autonomous weapons, serving as the primary integrative mechanism linking technical capabilities to ethical and legal requirements. Second, the document corpus reveals a significant discourse gap regarding dynamic autonomy management mechanisms—while substantial attention is devoted to what levels of autonomy should be permissible, comparatively little addresses how transitions between autonomy levels should be managed in operational contexts. Third, the accountability-autonomy tension is the most consistently articulated concern across all source categories, with the accountability chain and accountability gap codes forming a tightly coupled conceptual pair. Fourth, meaningful human control has emerged as the dominant normative framework but lacks operational specificity—the MHC concept appeared in 19.0% of documents, but MHC Definition/Standards appeared in only 2.4%, indicating a gap between the principle's endorsement and its operationalization. Fifth, source-specific emphases reveal a fragmented discourse landscape: governmental bodies focus on oversight and technology, while advocacy organizations emphasize ethics and accountability, with think tanks serving as the primary integrative discourse space. Sixth, trust calibration and explainability/transparency themes, though infrequent in the current corpus (combined frequency = 8), represent emerging concerns that are likely to grow in significance as AI-enabled weapons systems become more prevalent.
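The three trigger categories can be read as a precedence-ordered decision rule for dynamic autonomy transitions. The sketch below is illustrative only: the threshold values, field names, and the ordering of governance-driven over performance-driven over threat-driven triggers are assumptions chosen for exposition, not parameters taken from the framework itself.

```python
from dataclasses import dataclass
from enum import Enum

class Architecture(Enum):
    HITL = "human-in-the-loop"
    HOTL = "human-on-the-loop"
    HOVL = "human-over-the-loop"

@dataclass
class OperationalState:
    threat_tempo: float     # incoming threats per minute (threat-driven trigger)
    system_accuracy: float  # rolling AI classification accuracy (performance-driven)
    roe_changed: bool       # rules-of-engagement update pending (governance-driven)

def evaluate_transfer(state: OperationalState,
                      tempo_ceiling: float = 4.0,
                      accuracy_floor: float = 0.90) -> Architecture:
    """Map the current operational state to a target C2 architecture."""
    if state.roe_changed:                       # governance-driven: revert to
        return Architecture.HITL                # explicit human authorization
    if state.system_accuracy < accuracy_floor:  # performance-driven: degrade autonomy
        return Architecture.HITL
    if state.threat_tempo > tempo_ceiling:      # threat-driven: raise autonomy
        return Architecture.HOVL
    return Architecture.HOTL                    # HOTL as the default configuration

print(evaluate_transfer(OperationalState(6.0, 0.95, False)).name)  # HOVL
```

The precedence ordering encodes the qualitative finding that governance constraints bind regardless of tempo: an ROE change forces reversion to explicit human authorization even under high threat tempo.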
Phase 2: Agent-Based Modeling Results

Model Calibration and Validation

The agent-based model was developed to simulate autonomous weapons employment scenarios across three C2 architectures (HITL, HOTL, HOVL) under three threat conditions (low, medium, high). The model was parameterized using values derived from the Phase 1 qualitative findings and established empirical literature on human-automation interaction. Key model parameters included human decision time distributions, system accuracy parameters, fatigue rates, threat tempo variations, civilian density levels, and autonomy-level-dependent governance rules.

Model validation employed three approaches. First, face validity was assessed by comparing the model's behavioral patterns against published empirical data on human response times in supervisory control tasks. HITL response latencies (M = 8.51 s, SD = 0.45) fall within the documented range for human decision-making in complex military scenarios. Second, internal validity was assessed through convergence analysis: the Monte Carlo simulation demonstrated stable parameter estimates after approximately 800 iterations per condition, with subsequent iterations producing less than 0.1% change in mean metric values. Third, sensitivity analysis (reported in Section 4.3.4) confirmed that the model's outputs responded logically to parameter perturbations, with system accuracy and human decision time exerting the strongest effects.

Architecture Comparison Results

The primary ABM analysis compared the three C2 architectures across six performance metrics aggregated from 13,500 Monte Carlo iterations (1,500 iterations × 3 architectures × 3 threat conditions). Table 4.5 presents the marginal summary statistics by architecture, collapsing across threat conditions.
Table 4.5
Agent-Based Model Performance Metrics by C2 Architecture (N = 4,500 per Architecture)

Metric                     HITL M (SD)     HOTL M (SD)     HOVL M (SD)
Decision Quality           0.79 (0.07)     0.90 (0.05)     0.91 (0.04)
Response Latency (s)       8.51 (0.45)     2.70 (0.15)     1.20 (0.03)
ROE Adherence              0.91 (0.02)     0.87 (0.02)     0.82 (0.01)
Accountability Integrity   0.98 (0.00)     0.86 (0.01)     0.68 (0.02)
Mission Success            0.72 (0.07)     0.86 (0.05)     0.89 (0.04)
Decision Accuracy (%)      79.03 (3.55)    89.86 (1.10)    91.49 (0.56)

The architecture comparison revealed a consistent pattern of tradeoffs across all six metrics. Decision quality increased monotonically with autonomy level: HITL (M = 0.79, SD = 0.07), HOTL (M = 0.90, SD = 0.05), HOVL (M = 0.91, SD = 0.04). This pattern reflects the AI system's superior computational accuracy in the simulated decision environment, unencumbered by human cognitive limitations such as fatigue and attentional lapses.

Response latency showed the most dramatic architecture effect. HITL architecture produced mean response latencies of 8.51 seconds (SD = 0.45), reflecting the inherent delays of human cognitive processing, situation assessment, and explicit authorization. HOTL reduced latency to 2.70 seconds (SD = 0.15) through system-initiated action with human override windows, while HOVL achieved the fastest responses at 1.20 seconds (SD = 0.03), operating under pre-set governance parameters without requiring real-time human authorization.

The accountability-autonomy tradeoff was starkly evident. HITL maintained the highest accountability chain integrity (M = 0.978, SD = 0.003), meaning that 97.8% of engagement decisions could be traced to an explicit human authorization. HOTL degraded to 86.3% (SD = 0.007), and HOVL fell to 68.2% (SD = 0.016). The widening standard deviations with increasing autonomy indicate greater variability in accountability outcomes, suggesting that HOVL operations produce less predictable accountability chains.
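The tradeoff structure summarized in Table 4.5 can be illustrated with a deliberately simplified Monte Carlo sketch. This is not the dissertation's calibrated ABM; the latency and accountability parameters below are illustrative placeholders chosen only to echo the reported marginal means:

```python
import random
import statistics

# Illustrative architecture parameters (NOT the calibrated ABM values):
# mean/SD of engagement latency in seconds, and the probability that an
# engagement remains traceable to an explicit human authorization.
ARCHITECTURES = {
    "HITL": {"latency_mean": 8.5, "latency_sd": 0.5, "p_accountable": 0.98},
    "HOTL": {"latency_mean": 2.7, "latency_sd": 0.2, "p_accountable": 0.86},
    "HOVL": {"latency_mean": 1.2, "latency_sd": 0.1, "p_accountable": 0.68},
}

def run_iterations(arch, n_iter=2000, seed=0):
    """Simulate n_iter engagements; return mean latency and accountability rate."""
    rng = random.Random(seed)
    p = ARCHITECTURES[arch]
    latencies = [max(0.1, rng.gauss(p["latency_mean"], p["latency_sd"]))
                 for _ in range(n_iter)]
    accountable = sum(rng.random() < p["p_accountable"] for _ in range(n_iter))
    return statistics.mean(latencies), accountable / n_iter

for arch in ARCHITECTURES:
    lat, acc = run_iterations(arch)
    print(f"{arch}: mean latency {lat:.2f} s, accountability {acc:.1%}")
```

Even this toy version reproduces the qualitative pattern of Table 4.5: latency falls and accountability integrity degrades as autonomy increases.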
ROE adherence followed a similar pattern: HITL (M = 0.910, SD = 0.018), HOTL (M = 0.868, SD = 0.015), HOVL (M = 0.820, SD = 0.015). The 9-percentage-point gap between HITL and HOVL in ROE adherence represents a substantively meaningful difference in the context of autonomous weapons employment, where each engagement decision carries legal and ethical implications. Mission success rate, however, showed the inverse pattern: HOVL (M = 0.893, SD = 0.044) outperformed HOTL (M = 0.863, SD = 0.051) and HITL (M = 0.716, SD = 0.075).

The decision accuracy percentages—a transformed metric derived from decision quality and contextual factors—reinforced these patterns. HOVL achieved the highest mean decision accuracy (91.49%, SD = 0.56), followed by HOTL (89.86%, SD = 1.10) and HITL (79.03%, SD = 3.55). The notably larger standard deviation for HITL decision accuracy suggests that human involvement introduces greater variability in decision outcomes. This variability has both positive and negative implications: it reflects the capacity for human judgment to adapt to novel situations (positive) but also susceptibility to fatigue, bias, and cognitive overload (negative).

Taken together, these six metrics paint a comprehensive picture of the architecture tradeoff space. No single architecture dominates across all dimensions. Navigating the Pareto frontier between operational effectiveness (speed, accuracy, mission success) and governance quality (accountability, ROE adherence) requires deliberate design choices about which dimensions to prioritize under different operational conditions. This finding directly motivates the dynamic autonomy approach: rather than selecting a single architecture for all contexts, the optimal strategy involves transitioning between architectures as conditions warrant.
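Such a transition strategy can be made concrete as a selection rule over operating conditions. The sketch below is a hypothetical rule set (the thresholds and variable names are illustrative inventions, not the DAM framework's actual triggers), showing how a default HOTL posture might shift toward HITL or HOVL:

```python
def select_architecture(threat_tempo: float, civilian_density: float) -> str:
    """Illustrative transition rule for a dynamic autonomy manager.

    threat_tempo: incoming threats per minute (hypothetical units).
    civilian_density: fraction of the engagement area with civilian presence.
    """
    if civilian_density > 0.5:
        return "HITL"   # collateral risk dominates: prioritize accountability/ROE
    if threat_tempo > 15.0:
        return "HOVL"   # engagement windows shorter than human response latency
    return "HOTL"       # balanced default identified across all four phases
```

In practice such a selector would operate inside binding governance parameters (for example, permitting HOVL only within pre-authorized engagement envelopes), consistent with the constraint structure described in the Phase 1 findings.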
Figure 4.6
Performance Metrics Comparison Across Three C2 Architectures

Table 4.6
Architecture × Threat Condition Interaction: Mean Performance Metrics

Architecture  Condition  Decision Quality  Response Latency  ROE Adherence  Account. Integrity  Mission Success
HITL          High       0.75              8.50              0.894          0.978               0.651
HITL          Low        0.83              8.53              0.924          0.978               0.770
HITL          Medium     0.80              8.50              0.912          0.978               0.726
HOTL          High       0.89              2.70              0.853          0.863               0.833
HOTL          Low        0.91              2.70              0.881          0.863               0.889
HOTL          Medium     0.90              2.70              0.871          0.863               0.866
HOVL          High       0.91              1.20              0.804          0.662               0.874
HOVL          Low        0.91              1.20              0.832          0.698               0.903
HOVL          Medium     0.92              1.20              0.823          0.684               0.901

Table 4.6 presents the full factorial results, revealing important architecture × condition interactions. HITL architecture showed the greatest performance degradation under high-threat conditions, with mission success dropping from 77.0% (low tempo) to 65.1% (high tempo)—a nearly 12-percentage-point decline. In contrast, HOVL maintained relatively stable mission success across conditions: 90.3% (low) to 87.4% (high), reflecting its reduced dependence on human cognitive capacity. HOTL showed intermediate sensitivity, declining from 88.9% to 83.3%.

Figure 4.7
Response Latency Distributions by C2 Architecture and Threat Condition

Figure 4.8
Architecture × Threat Condition Interaction for All Performance Metrics

Monte Carlo Simulation Outcomes

The Monte Carlo simulation executed 13,500 total iterations (1,500 per architecture-condition combination across the 3 × 3 factorial design). Each iteration simulated a complete engagement sequence with stochastic parameter variation drawn from empirically calibrated distributions. The number of engagements per iteration varied stochastically, ranging from 10 to 40 with a mean of approximately 24 engagements per scenario.
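Convergence of a Monte Carlo run of this kind is typically monitored by tracking the stability of the estimate as iterations accumulate. A minimal batch-means sketch (a generic diagnostic, not the dissertation's exact procedure) applied to a stand-in mission-success stream:

```python
import random
import statistics

def coefficient_of_variation(xs):
    m = statistics.mean(xs)
    return statistics.stdev(xs) / m if m else float("inf")

def running_cv_of_mean(samples, checkpoints):
    """Convergence diagnostic: at each checkpoint n, split the first n
    samples into 10 equal batches and return the coefficient of variation
    of the batch means. Smaller values indicate a more stable estimate."""
    out = {}
    for n in checkpoints:
        batch = n // 10
        means = [statistics.mean(samples[i * batch:(i + 1) * batch])
                 for i in range(10)]
        out[n] = coefficient_of_variation(means)
    return out

rng = random.Random(42)
# Stand-in for a stochastic mission-success stream (Bernoulli, p = 0.86).
stream = [1.0 if rng.random() < 0.86 else 0.0 for _ in range(1500)]
print(running_cv_of_mean(stream, [100, 800, 1500]))
```

With a Bernoulli stream at this success rate, the batch-mean CV shrinks roughly with the square root of the iteration count, mirroring the pattern reported below (stable estimates by roughly 800 iterations, tighter still at 1,500).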
Convergence analysis confirmed that the Monte Carlo simulation achieved stable parameter estimates. The coefficient of variation for all six primary metrics fell below 0.05 after 800 iterations per condition, and below 0.02 after the full 1,500 iterations. The 95% confidence intervals for all architecture-level metrics were narrow (e.g., mission success: HITL [0.710, 0.721], HOTL [0.858, 0.868], HOVL [0.888, 0.897]), indicating that the simulation produced precise estimates suitable for parameterizing the Phase 3 experimental design.

Figure 4.9
Mission Success Rate Across Threat Conditions by C2 Architecture

Sensitivity Analysis

Sensitivity analysis was conducted on six key model parameters to assess the robustness of the simulation results and identify the parameters most influential on mission success outcomes. Each parameter was varied independently between low and high values while holding all other parameters at their baseline levels. Table 4.7 presents the sensitivity analysis results, showing the parameter values tested and the resulting changes in mission success rate.

Table 4.7
Sensitivity Analysis Results: Parameter Effects on Mission Success Rate

Parameter            Level  Value  Mission Success  Δ from Baseline
Human Decision Time  Low     4.00  0.8928           +0.0322
Human Decision Time  High   15.00  0.8292           -0.0314
System Accuracy      Low     0.80  0.8020           -0.0586
System Accuracy      High    0.96  0.8994           +0.0388
Human Fatigue Rate   Low     0.01  0.8620           +0.0014
Human Fatigue Rate   High    0.05  0.8752           +0.0146
Threat Tempo         Low     3.00  0.8712           +0.0106
Threat Tempo         High   20.00  0.8698           +0.0092
Civilian Density     Low     0.10  0.8550           -0.0056
Civilian Density     High    0.90  0.8628           +0.0022
Autonomy Level       Low     0.10  0.8730           +0.0124
Autonomy Level       High    0.90  0.8628           +0.0022

System accuracy emerged as the most influential parameter, with a total swing of 9.7 percentage points in mission success between the low (0.80; mission success = 0.802) and high (0.96; mission success = 0.899) values.
This finding underscores the critical importance of AI system reliability as a determinant of overall framework effectiveness. Human decision time was the second most influential parameter, with a total swing of 6.4 percentage points. Reducing human decision time from the baseline to 4.0 seconds increased mission success to 0.893, while increasing it to 15.0 seconds reduced mission success to 0.829.

Human fatigue rate, threat tempo, civilian density, and autonomy level showed smaller effects, each producing changes of less than 2 percentage points from baseline. The relative insensitivity to threat tempo at the aggregate level is notable, as the architecture × condition analysis (Table 4.6) revealed that threat tempo effects vary substantially by architecture type. This suggests that threat tempo's impact is moderated by the C2 architecture rather than operating as an independent additive effect.

Figure 4.10
Tornado Diagram Showing Parameter Sensitivity on Mission Success Rate

Key ABM Findings

Three key findings emerged from the Phase 2 agent-based modeling analysis. First, the fundamental speed-accountability tradeoff was quantified: moving from HITL to HOVL reduces response latency by 85.9% (from 8.51 s to 1.20 s) but simultaneously reduces accountability chain integrity by 30.3% in relative terms (from 97.8% to 68.2%). This tradeoff defines the operational design space within which dynamic autonomy management must operate.

Second, HOTL emerged as the optimal compromise architecture. HOTL achieved 86.3% mission success with 86.3% accountability integrity—representing the most balanced position on the Pareto frontier between operational effectiveness and accountability preservation. The near-equivalence of these two percentage values is coincidental but conceptually instructive: HOTL achieves approximate parity between mission effectiveness and accountability, whereas HITL sacrifices effectiveness for accountability and HOVL sacrifices accountability for effectiveness.

Third, the sensitivity analysis identified system accuracy and human decision time as the most consequential design parameters, suggesting that investments in AI reliability and human-computer interface efficiency would yield the greatest marginal improvements in framework performance. These findings directly informed the Phase 3 experimental design, which examined how human participants respond to the autonomy-accountability tradeoffs identified by the ABM.

Phase 3: Simulation-Based Experimental Results

Participant and Data Overview

Phase 3 employed a simulation-based experimental design with 118 simulated participants in a 3 × 3 between-subjects factorial design. The two independent variables were autonomy level (HITL, HOTL, HOVL) and threat tempo (low, medium, high), yielding nine experimental conditions. Sample sizes per cell ranged from 13 to 14 participants, providing adequate statistical power for detecting medium-to-large effects. Five dependent variables were measured: decision accuracy (percentage correct), response time (seconds), trust score (1-7 Likert scale), cognitive load (NASA-TLX, 0-100), and ROE compliance (percentage).

Data screening confirmed that all dependent variables met assumptions for parametric analysis. No multivariate outliers exceeded the critical Mahalanobis distance, and Levene's tests for homogeneity of variance were nonsignificant for all DVs except cognitive load, F(8, 109) = 1.89, p = .068, which approached but did not reach the α = .05 threshold. Box's M test for equality of covariance matrices was nonsignificant (p = .12), supporting the use of MANOVA.
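Homogeneity screening of this kind can be reproduced directly from raw scores. Below is a minimal Brown-Forsythe-style implementation (the median-centered variant of Levene's test), shown on toy data rather than the study's:

```python
import statistics

def levene_w(groups, center=statistics.median):
    """Levene/Brown-Forsythe statistic: a one-way ANOVA F computed on the
    absolute deviations of each score from its group's center (the median
    here, per Brown-Forsythe). Large W suggests unequal group spreads."""
    z = [[abs(x - center(g)) for x in g] for g in groups]
    k = len(z)
    n = sum(len(g) for g in z)
    grand = sum(sum(g) for g in z) / n
    ss_between = sum(len(g) * (statistics.mean(g) - grand) ** 2 for g in z)
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in z)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Equal-spread groups yield W at or near 0; unequal spreads inflate W.
print(levene_w([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]))        # 0.0
print(levene_w([[1, 2, 3, 4, 5], [-10, 0, 10, 20, 30]]))   # substantially larger
```

The resulting W is referred to an F distribution with (k − 1, N − k) degrees of freedom, matching the F(8, 109) form reported above for nine cells and 118 participants.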
Descriptive Statistics

Table 4.8 presents the overall descriptive statistics for all five dependent variables across the full sample of 118 participants. Decision accuracy averaged 82.63% (SD = 8.50), response time averaged 5.69 seconds (SD = 4.58), trust scores averaged 4.60 (SD = 1.15), cognitive load averaged 41.33 (SD = 20.10), and ROE compliance averaged 87.89% (SD = 7.19).

Table 4.8
Overall Descriptive Statistics for Dependent Variables (N = 118)

Variable                 M      SD     Min    Mdn    Max
Decision Accuracy (%)    82.63  8.50   56.70  83.05  97.60
Response Time (s)        5.69   4.58   0.50   3.99   17.37
Trust Score (1-7)        4.60   1.15   1.60   4.70   7.00
Cognitive Load (0-100)   41.33  20.10  4.40   38.05  99.10
ROE Compliance (%)       87.89  7.19   66.30  89.05  100.00

Table 4.9
Descriptive Statistics by Experimental Condition (Autonomy Level × Threat Tempo)

Autonomy  Tempo   Decision Accuracy  Response Time  Trust Score  Cognitive Load  ROE Compliance
                  M (SD)             M (SD)         M (SD)       M (SD)          M (SD)
HITL      Low     85.54 (4.73)       12.77 (3.95)   5.56 (0.69)  32.42 (8.11)    96.15 (2.36)
HITL      Medium  78.38 (6.02)       10.81 (4.15)   5.41 (0.80)  52.15 (6.38)    91.73 (4.37)
HITL      High    71.58 (8.03)       8.53 (2.95)    5.30 (0.96)  76.58 (11.33)   86.80 (4.98)
HOTL      Low     87.26 (5.42)       5.22 (2.15)    4.69 (0.73)  26.48 (6.14)    91.86 (4.45)
HOTL      Medium  85.15 (6.11)       4.38 (1.26)    4.81 (0.70)  33.95 (11.50)   85.98 (6.74)
HOTL      High    78.49 (12.69)      3.72 (1.31)    4.21 (1.04)  61.96 (12.04)   82.44 (5.43)
HOVL      Low     89.38 (4.23)       1.86 (0.82)    4.28 (1.09)  20.60 (9.72)    88.75 (4.96)
HOVL      Medium  85.92 (6.12)       1.70 (0.58)    3.38 (0.80)  27.32 (11.09)   87.52 (7.07)
HOVL      High    81.78 (4.92)       1.69 (0.45)    3.67 (1.27)  41.22 (12.86)   79.18 (7.21)

The cell-level descriptive statistics reveal pronounced patterns. For decision accuracy, the lowest mean was observed in the HITL/High condition (M = 71.58, SD = 8.03), while the highest was in the HOVL/Low condition (M = 89.38, SD = 4.23).
Response time ranged from 1.69 seconds (HOVL/High) to 12.77 seconds (HITL/Low), spanning nearly an order of magnitude. Cognitive load showed the most dramatic variation, from 20.60 (HOVL/Low) to 76.58 (HITL/High)—a range exceeding 55 points on the 100-point NASA-TLX scale.

Figure 4.11
Grouped Bar Charts With Error Bars for All Dependent Variables by Autonomy Level and Threat Tempo

MANOVA Results

A two-way multivariate analysis of variance (MANOVA) was conducted to evaluate the simultaneous effects of autonomy level (HITL, HOTL, HOVL) and threat tempo (low, medium, high) on the five dependent variables. The MANOVA was significant for all three effects. The main effect of autonomy level was significant using Pillai's trace, V = 0.681, F(10, 212) = 10.94, p < .001, as well as Wilks' Λ = 0.344, F(10, 210) = 14.83, p < .001. The main effect of threat tempo was also significant, Pillai's trace V = 0.684, F(10, 212) = 11.02, p < .001; Wilks' Λ = 0.317, F(10, 210) = 16.31, p < .001. The autonomy level × threat tempo interaction was significant, Pillai's trace V = 0.345, F(20, 432) = 2.04, p = .005; Wilks' Λ = 0.679, F(20, 349.20) = 2.16, p = .003.

Given the significant multivariate effects, separate univariate two-way ANOVAs were conducted for each dependent variable to identify the specific variables driving the overall multivariate effects. All follow-up ANOVAs are reported in the subsequent sections. Roy's greatest root provides additional insight into the multivariate effects. For the interaction, Roy's greatest root = 0.344, F(5, 108) = 7.42, p < .001, indicating that the first canonical variate of the interaction captured a substantial portion of multivariate variance.
The small discrepancy between the Pillai's trace (p = .005) and Wilks' lambda (p = .003) significance levels for the interaction, both below the conventional alpha of .05, reflects the differing sensitivity of multivariate test statistics to violations of distributional assumptions. The consistent significance across test statistics for all three effects increases confidence in the robustness of these findings.

Univariate ANOVA Results for Each Dependent Variable

Table 4.10
Summary of Two-Way ANOVA Results for All Dependent Variables

DV                 Source                         SS        df  F        p        η²     η²p
Decision Accuracy  autonomy level                 1067.71   2   11.223   < .001*  0.126  0.171
Decision Accuracy  threat tempo                   2052.44   2   21.574   < .001*  0.242  0.284
Decision Accuracy  autonomy level × threat tempo  182.39    4   0.959    .433     0.021  0.034
Response Time      autonomy level                 1683.19   2   147.208  < .001*  0.689  0.730
Response Time      threat tempo                   78.86     2   6.897    .002*    0.032  0.112
Response Time      autonomy level × threat tempo  57.72     4   2.524    .045*    0.024  0.085
Trust Score        autonomy level                 53.42     2   31.907   < .001*  0.349  0.369
Trust Score        threat tempo                   4.14      2   2.475    .089     0.027  0.043
Trust Score        autonomy level × threat tempo  4.42      4   1.321    .267     0.029  0.046
Cognitive Load     autonomy level                 11250.93  2   54.476   < .001*  0.236  0.500
Cognitive Load     threat tempo                   22942.03  2   111.082  < .001*  0.481  0.671
Cognitive Load     autonomy level × threat tempo  2194.85   4   5.314    < .001*  0.046  0.163
ROE Compliance     autonomy level                 882.02    2   14.766   < .001*  0.146  0.213
ROE Compliance     threat tempo                   1780.93   2   29.814   < .001*  0.296  0.354
ROE Compliance     autonomy level × threat tempo  101.70    4   0.851    .496     0.017  0.030

Decision Accuracy

The two-way ANOVA for decision accuracy revealed significant main effects for both autonomy level, F(2, 109) = 11.22, p < .001, η²p = .17, and threat tempo, F(2, 109) = 21.57, p < .001, η²p = .28. The interaction was nonsignificant, F(4, 109) = 0.96, p = .433, η²p = .03.
Threat tempo accounted for the larger proportion of variance (η² = .242), indicating that the speed at which threats are presented has a greater impact on decision accuracy than the level of autonomy. Tukey HSD post-hoc comparisons for autonomy level revealed that HOVL (M = 85.69) significantly outperformed HITL (M = 78.68) by 7.01 percentage points (p < .001), and HOTL (M = 83.64) significantly outperformed HITL by 4.96 points (p = .019). The HOTL-HOVL difference (2.05 points) was nonsignificant (p = .498). For threat tempo, all pairwise comparisons were significant: low tempo (M = 87.34) exceeded high tempo (M = 77.28) by 10.06 points (p < .001), low exceeded medium (M = 83.15) by 4.19 points (p = .037), and medium exceeded high by 5.87 points (p = .002).

Figure 4.12
Interaction Plots for All Five Dependent Variables (Autonomy Level × Threat Tempo)

Response Time

Response time showed the largest autonomy level effect of all dependent variables. The main effect of autonomy level was significant, F(2, 109) = 147.21, p < .001, η²p = .73, representing a very large effect. Threat tempo also significantly affected response time, F(2, 109) = 6.90, p = .002, η²p = .11, and the interaction was significant, F(4, 109) = 2.52, p = .045, η²p = .08. The autonomy level main effect accounted for 68.9% of total variance, indicating that autonomy architecture is the overwhelming determinant of response time.

Post-hoc comparisons confirmed that all three autonomy levels differed significantly from each other: HITL (M = 10.76 s) was significantly slower than HOTL (M = 4.44 s) by 6.32 seconds (p < .001), HITL was slower than HOVL (M = 1.75 s) by 9.01 seconds (p < .001), and HOTL was slower than HOVL by 2.69 seconds (p < .001).
The significant interaction indicates that the response time advantage of increased autonomy was most pronounced under low threat tempo conditions, where HITL response times increased to 12.77 seconds while HOVL remained at 1.86 seconds.

Trust Scores

Trust scores showed a significant main effect of autonomy level, F(2, 109) = 31.91, p < .001, η²p = .37, but the main effect of threat tempo was nonsignificant, F(2, 109) = 2.48, p = .089, η²p = .04, and the interaction was nonsignificant, F(4, 109) = 1.32, p = .267, η²p = .05. Trust decreased significantly as autonomy increased: HITL (M = 5.42, SD = 0.80), HOTL (M = 4.57, SD = 0.86), HOVL (M = 3.77, SD = 1.11). All three pairwise comparisons were significant: HITL vs. HOTL (mean difference = 0.86, p < .001), HITL vs. HOVL (mean difference = 1.65, p < .001), and HOTL vs. HOVL (mean difference = 0.80, p = .001).

The monotonic decline in trust with increasing autonomy is a critical finding for dynamic autonomy management. Despite HOVL's superior decision accuracy (M = 85.69% vs. HITL's 78.68%), participants reported substantially lower trust in the more autonomous system. This trust-accuracy divergence suggests that performance alone is insufficient to generate trust; the perceived mechanism of decision-making—specifically, the degree of human involvement—significantly influences trust calibration. The absence of a significant threat tempo effect on trust indicates that trust calibration is primarily architecture-dependent rather than context-dependent.

Cognitive Load (NASA-TLX)

Cognitive load, measured via the NASA Task Load Index, showed significant main effects for both autonomy level, F(2, 109) = 54.48, p < .001, η²p = .50, and threat tempo, F(2, 109) = 111.08, p < .001, η²p = .67. The interaction was also significant, F(4, 109) = 5.31, p < .001, η²p = .16.
Threat tempo produced the largest effect size of any factor-DV combination in the study (η²p = .67), accounting for 48.1% of total variance in cognitive load scores. Mean cognitive load declined with increasing autonomy: HITL (M = 53.18, SD = 20.28), HOTL (M = 40.80, SD = 18.41), HOVL (M = 29.72, SD = 14.02). All pairwise comparisons were significant (all ps < .02).

The significant interaction revealed that the autonomy-cognitive load relationship was moderated by threat tempo. Under high threat tempo, the HITL condition produced substantially elevated cognitive load (M = 76.58), while HOTL (M = 61.96) and HOVL (M = 41.22) were progressively lower. Under low threat tempo, all three architectures produced relatively low cognitive load (HITL: 32.42, HOTL: 26.48, HOVL: 20.60). The interaction effect (η²p = .16) indicates that the cognitive cost of maintaining human oversight is disproportionately greater under high-tempo conditions, a finding with direct implications for dynamic autonomy transition design.

ROE Compliance

ROE compliance showed significant main effects for autonomy level, F(2, 109) = 14.77, p < .001, η²p = .21, and threat tempo, F(2, 109) = 29.81, p < .001, η²p = .35. The interaction was nonsignificant, F(4, 109) = 0.85, p = .496, η²p = .03. ROE compliance declined with increasing autonomy: HITL (M = 91.68%, SD = 5.52), HOTL (M = 86.76%, SD = 6.74), HOVL (M = 85.15%, SD = 7.65). Post-hoc comparisons showed HITL significantly exceeded HOTL (p = .004) and HOVL (p < .001), but HOTL and HOVL did not differ significantly (p = .540).

Threat tempo effects were substantial: low tempo conditions yielded the highest compliance (M = 92.35%, SD = 5.01), followed by medium (M = 88.41%, SD = 6.50) and high (M = 82.81%, SD = 6.60). All tempo pairwise comparisons were significant (ps < .013).
The combination of high autonomy and high threat tempo produced the lowest ROE compliance in the study (HOVL/High: M = 79.18%, SD = 7.21), a finding with serious implications for autonomous weapons employment in contested environments.

Effect Size Summary

Figure 4.13 presents a comprehensive visualization of partial eta-squared effect sizes across all factor-DV combinations. The effect size landscape reveals a clear hierarchical pattern. The two largest effects in the study were autonomy level on response time (η²p = .73) and threat tempo on cognitive load (η²p = .67), both representing very large effects by conventional standards (Cohen, 1988). Autonomy level on cognitive load (η²p = .50) and autonomy level on trust score (η²p = .37) represented large effects. Threat tempo on ROE compliance (η²p = .35) and threat tempo on decision accuracy (η²p = .28) represented medium-to-large effects.

Figure 4.13
Partial Eta-Squared Effect Sizes for All Factor-DV Combinations

The two significant interactions—autonomy × tempo on cognitive load (η²p = .16) and autonomy × tempo on response time (η²p = .08)—were small-to-medium in magnitude. Notably, trust score showed no significant interaction (η²p = .05), suggesting that the trust-autonomy relationship operates consistently regardless of threat conditions.

Post-Hoc Comparisons

Table 4.11 presents the complete Tukey HSD post-hoc comparison results for all dependent variables. These pairwise comparisons clarify the specific group differences driving the omnibus ANOVA effects.
Table 4.11
Tukey HSD Post-Hoc Pairwise Comparisons for All Dependent Variables

DV                 Factor          Comparison      Mean Diff  p       95% CI            Sig.
Decision Accuracy  Autonomy Level  HITL vs HOTL    4.958      .019    [0.67, 9.25]      Yes
Decision Accuracy  Autonomy Level  HITL vs HOVL    7.012      < .001  [2.72, 11.30]     Yes
Decision Accuracy  Autonomy Level  HOTL vs HOVL    2.054      .498    [-2.27, 6.37]     No
Decision Accuracy  Threat Tempo    High vs Low     10.060     < .001  [6.06, 14.06]     Yes
Decision Accuracy  Threat Tempo    High vs Medium  5.867      .002    [1.84, 9.89]      Yes
Decision Accuracy  Threat Tempo    Low vs Medium   -4.194     .037    [-8.19, -0.20]    Yes
Response Time      Autonomy Level  HITL vs HOTL    -6.319     < .001  [-7.69, -4.95]    Yes
Response Time      Autonomy Level  HITL vs HOVL    -9.008     < .001  [-10.38, -7.63]   Yes
Response Time      Autonomy Level  HOTL vs HOVL    -2.689     < .001  [-4.07, -1.31]    Yes
Response Time      Threat Tempo    High vs Low     2.126      .098    [-0.30, 4.55]     No
Response Time      Threat Tempo    High vs Medium  0.984      .605    [-1.45, 3.42]     No
Response Time      Threat Tempo    Low vs Medium   -1.142     .504    [-3.56, 1.28]     No
Trust Score        Autonomy Level  HITL vs HOTL    -0.856     < .001  [-1.35, -0.36]    Yes
Trust Score        Autonomy Level  HITL vs HOVL    -1.651     < .001  [-2.15, -1.15]    Yes
Trust Score        Autonomy Level  HOTL vs HOVL    -0.795     < .001  [-1.30, -0.29]    Yes
Trust Score        Threat Tempo    High vs Low     0.468      .166    [-0.14, 1.08]     No
Trust Score        Threat Tempo    High vs Medium  0.139      .853    [-0.47, 0.75]     No
Trust Score        Threat Tempo    Low vs Medium   -0.329     .406    [-0.94, 0.28]     No
Cognitive Load     Autonomy Level  HITL vs HOTL    -12.388    .007    [-21.89, -2.88]   Yes
Cognitive Load     Autonomy Level  HITL vs HOVL    -23.470    < .001  [-32.98, -13.96]  Yes
Cognitive Load     Autonomy Level  HOTL vs HOVL    -11.082    .019    [-20.65, -1.52]   Yes
Cognitive Load     Threat Tempo    High vs Low     -33.273    < .001  [-41.10, -25.44]  Yes
Cognitive Load     Threat Tempo    High vs Medium  -22.110    < .001  [-29.99, -14.23]  Yes
Cognitive Load     Threat Tempo    Low vs Medium   11.163     .003    [3.33, 18.99]     Yes
ROE Compliance     Autonomy Level  HITL vs HOTL    -4.916     .004    [-8.49, -1.34]    Yes
ROE Compliance     Autonomy Level  HITL vs HOVL    -6.524     < .001  [-10.10, -2.95]   Yes
ROE Compliance     Autonomy Level  HOTL vs HOVL    -1.608     .539    [-5.20, 1.99]     No
ROE Compliance     Threat Tempo    High vs Low     9.545      < .001  [6.30, 12.79]     Yes
ROE Compliance     Threat Tempo    High vs Medium  5.600      < .001  [2.33, 8.87]      Yes
ROE Compliance     Threat Tempo    Low vs Medium   -3.945     .013    [-7.19, -0.70]    Yes

Figure 4.14
Distribution Histograms for All Dependent Variables by Autonomy Level

Key Experimental Findings

The Phase 3 experimental analysis yielded five key findings. First, autonomy level exerted the strongest effect on response time (η²p = .73), confirming the ABM's prediction that C2 architecture fundamentally determines engagement tempo. The experimental response times (HITL: 10.76 s, HOTL: 4.44 s, HOVL: 1.75 s) showed the same rank ordering as the ABM predictions (8.51, 2.70, 1.20 s) with slightly elevated values attributable to the additional cognitive overhead of human information processing.

Second, threat tempo was the dominant predictor of cognitive load (η²p = .67), with the critical interaction revealing that human oversight becomes disproportionately cognitively expensive under high-tempo conditions. Third, trust declined monotonically with increasing autonomy despite improvements in objective accuracy, representing a trust-accuracy paradox with significant implications for operator acceptance of dynamic autonomy systems. Fourth, ROE compliance degraded with both increasing autonomy and increasing threat tempo, with the lowest compliance (79.2%) observed in the HOVL/High condition. Fifth, HOTL consistently occupied the intermediate position across all dependent variables, confirming its role as the balanced architecture for dynamic autonomy management.
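The partial eta-squared values reported throughout this section can be recovered directly from each F statistic and its degrees of freedom, which makes the tabled effect sizes easy to audit:

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Recover partial eta-squared from an F statistic and its degrees of
    freedom: eta_p^2 = (F * df_effect) / (F * df_effect + df_error)."""
    return f_stat * df_effect / (f_stat * df_effect + df_error)

# Cross-checks against Table 4.10 (df_error = 109 for the main effects):
print(round(partial_eta_squared(147.208, 2, 109), 2))  # autonomy -> response time: 0.73
print(round(partial_eta_squared(111.082, 2, 109), 2))  # tempo -> cognitive load: 0.67
print(round(partial_eta_squared(31.907, 2, 109), 2))   # autonomy -> trust: 0.37
```

All three recovered values match the reported effect sizes, confirming internal consistency between the F tests and the η²p columns.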
Phase 4: Tabletop Exercise Validation Results

Expert Panel Description

The Phase 4 tabletop exercise validation engaged 18 simulated defense professionals representing six categories of expertise relevant to autonomous weapons command and control. The panel comprised Senior Military Officers (n = 4, ranks O-5 to O-6, branches: Army, Navy, Air Force, Marine Corps), DoD Civilians (n = 2, SES and GS-15 from OSD and DARPA), Defense Industry professionals (n = 2, an engineer and program manager), Think Tank researchers (n = 2, from CNAS and RAND), Academic professors (n = 2, from the Naval Postgraduate School and West Point), and Additional subject matter experts (n = 6, including Congressional staff, JAG, intelligence, test and evaluation, cyber, and special operations). Years of experience ranged from 8 to 28 (M = 17.6), with specialties spanning C2 systems, autonomous systems, AI policy, weapons systems, human factors, military ethics, law of armed conflict, and ISR operations.

Quantitative Validation Ratings

Each expert rated the Dynamic Autonomy Management framework on five criteria using a 7-point Likert scale (1 = strongly disagree to 7 = strongly agree). Table 4.12 presents the descriptive statistics for all five validation criteria.

Table 4.12
Expert Validation Ratings: Descriptive Statistics (N = 18)

Criterion                M     SD    Mdn  Min  Max
Feasibility              5.17  0.99  5.0  3.0  7.0
Doctrinal Compatibility  5.50  0.79  6.0  4.0  7.0
Traceability             5.83  0.62  6.0  5.0  7.0
MHC Preservation         5.56  1.20  5.0  4.0  7.0
Scalability              4.72  1.18  5.0  3.0  6.0

All five criteria received mean ratings above the neutral midpoint of 4.0, indicating overall positive expert evaluation. Traceability (Decision Traceability) received the highest mean rating (M = 5.83, SD = 0.62), followed by MHC Preservation (M = 5.56, SD = 1.20), Doctrinal Compatibility (M = 5.50, SD = 0.79), Feasibility (M = 5.17, SD = 0.99), and Scalability (M = 4.72, SD = 1.18).
Scalability showed the greatest variability (SD = 1.18), reflecting divergent expert opinions on the framework's ability to scale to multi-domain operations.

Figure 4.15
Radar Chart of Mean Expert Ratings Across Five Validation Criteria

Table 4.13
One-Sample t-Tests Against Neutral Midpoint (Test Value = 4.0)

Criterion                M     SD    t(17)   p       Cohen's d  95% CI
Operational Feasibility  5.17  0.99  5.024   < .001  1.18       [4.68, 5.66]
Doctrinal Compatibility  5.50  0.79  8.098   < .001  1.91       [5.11, 5.89]
Decision Traceability    5.83  0.62  12.579  < .001  2.96       [5.53, 6.14]
MHC Preservation         5.56  1.20  5.504   < .001  1.30       [4.96, 6.15]
Scalability              4.72  1.18  2.600   .019    0.61       [4.14, 5.31]

One-sample t-tests confirmed that all five criteria were rated significantly above the neutral midpoint of 4.0. Decision Traceability showed the strongest departure from neutral, t(17) = 12.58, p < .001, d = 2.96, representing a very large effect. Doctrinal Compatibility was also strongly endorsed, t(17) = 8.10, p < .001, d = 1.91. MHC Preservation, t(17) = 5.50, p < .001, d = 1.30, and Operational Feasibility, t(17) = 5.02, p < .001, d = 1.18, showed large effects. Scalability showed the smallest effect, t(17) = 2.60, p = .019, d = 0.61, a medium effect, indicating that while experts rated scalability above neutral, their endorsement was substantially weaker than for the other criteria.

Figure 4.16
Box Plots of Expert Ratings by Validation Criterion With Individual Data Points

Inter-Rater Reliability

Inter-rater reliability was assessed using the intraclass correlation coefficient (ICC). The ICC(C,k) for consistency across criteria was 0.726, F(4, 68) = 3.65, p = .009, 95% CI [.18, .97], indicating good inter-rater agreement at the group level.
The ICC(C,1) for individual rater consistency was 0.128, reflecting the expected lower reliability at the single-rater level given the heterogeneity of expert perspectives across six professional categories.

Table 4.14
Intraclass Correlation Coefficient Results

ICC Type  ICC    F      df1  df2  p
ICC(1,1)  0.118  3.405  4    85   .0124
ICC(A,1)  0.121  3.646  4    68   .0095
ICC(C,1)  0.128  3.646  4    68   .0095
ICC(1,k)  0.706  3.405  4    85   .0124
ICC(A,k)  0.712  3.646  4    68   .0095
ICC(C,k)  0.726  3.646  4    68   .0095

Krippendorff's alpha for ordinal data was .058, which falls below conventional thresholds for acceptable inter-rater reliability. However, this metric should be interpreted with caution: the expert panel was intentionally composed to represent diverse professional perspectives (military, civilian, academic, industry, policy), and some disagreement is expected and indeed desirable as it captures genuine differences in how various stakeholder communities evaluate the framework. The ICC(C,k) of .726 at the aggregate level indicates that the panel as a whole showed good consistency in its relative ranking of criteria.

Qualitative Feedback Themes

Expert qualitative feedback was analyzed using thematic analysis, yielding 10 distinct themes categorized by valence (positive, concern, or neutral). Table 4.15 presents the feedback themes ranked by frequency, along with representative quotes.
Table 4.15

Expert Qualitative Feedback Themes With Representative Quotes (N = 18)

Theme                            n   Valence   Representative Quote
Implementation Complexity        14  Concern   Dynamic autonomy adjustment during active operations would require significant training investmen...
Doctrine Alignment               12  Positive  The framework maps well to existing mission command philosophy, particularly the concept of disci...
Accountability Clarity           15  Positive  The explicit transfer-of-control protocol provides a clear chain of accountability that addresses...
Trust Calibration Mechanism      11  Positive  The continuous trust calibration loop is the most innovative aspect—it creates a mechanism to pre...
Scalability Concerns             13  Concern   Scaling from a single platform to multi-domain operations with dozens of autonomous systems prese...
Legal/Ethical Compliance         10  Positive  The MHC preservation criteria provide a defensible framework for Article 36 weapons reviews and I...
Operational Tempo Adaptability   9   Concern   In high-tempo environments, the overhead of dynamic autonomy transitions may exceed available dec...
Interoperability Requirements    8   Concern   Integration with legacy C2 systems and allied force architectures would require standardized auto...
Training Requirements            12  Neutral   Operators would need extensive training to develop calibrated mental models for when and how to a...
Adversarial Robustness           7   Concern   The framework needs to address adversarial manipulation of autonomy triggers through cyber or ele...

Accountability Clarity was the most frequently cited theme (n = 15), with experts praising the framework's explicit transfer-of-control protocol as addressing a critical gap in autonomous weapons governance. Implementation Complexity (n = 14) and Scalability Concerns (n = 13) were the most frequently cited concerns, reflecting practical challenges of operationalizing dynamic autonomy management.
Doctrine Alignment (n = 12) and Training Requirements (n = 12) were noted with equal frequency, the former positively (framework maps to existing mission command philosophy) and the latter neutrally (acknowledging the need for extensive operator training). The positive themes collectively validated the theoretical foundations of the framework: Accountability Clarity confirmed the effectiveness of the traceability mechanisms, Doctrine Alignment endorsed the framework's compatibility with military culture, Trust Calibration Mechanism highlighted the innovative continuous trust monitoring loop, and Legal/Ethical Compliance confirmed alignment with IHL and Article 36 review processes. The concern themes identified actionable areas for framework refinement: scalability, operational tempo adaptability, interoperability with legacy systems, and adversarial robustness against cyber threats.

Operational Tempo Adaptability was cited by nine experts as a concern, with several noting that in high-tempo combat environments, the cognitive overhead of managing dynamic autonomy transitions may itself become a bottleneck. This concern directly corroborates the Phase 3 finding that the autonomy × tempo interaction on cognitive load was significant (η²p = .16).

Interoperability Requirements (n = 8) and Adversarial Robustness (n = 7) represent operational concerns that extend beyond the current framework's scope but are essential for real-world deployment. The interoperability concern highlights the challenge of integrating dynamic autonomy management with legacy C2 systems and allied force architectures that operate under different autonomy paradigms. The adversarial robustness concern raises the possibility that adversaries could manipulate autonomy triggers through cyber or electronic warfare, potentially forcing inappropriate autonomy transitions during critical operations.
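The theme structure of Table 4.15 can be tallied directly. The following minimal sketch transcribes the theme counts and valences from the table; the tally logic is illustrative and is not part of the dissertation's analysis pipeline:

```python
from collections import Counter

# (theme, number of experts citing it, valence) as reported in Table 4.15.
THEMES = [
    ("Accountability Clarity", 15, "Positive"),
    ("Implementation Complexity", 14, "Concern"),
    ("Scalability Concerns", 13, "Concern"),
    ("Doctrine Alignment", 12, "Positive"),
    ("Training Requirements", 12, "Neutral"),
    ("Trust Calibration Mechanism", 11, "Positive"),
    ("Legal/Ethical Compliance", 10, "Positive"),
    ("Operational Tempo Adaptability", 9, "Concern"),
    ("Interoperability Requirements", 8, "Concern"),
    ("Adversarial Robustness", 7, "Concern"),
]

def valence_distribution(themes):
    """Count how many distinct themes fall under each valence label."""
    return Counter(valence for _, _, valence in themes)

def top_theme(themes):
    """Return the most frequently cited theme and its citation count."""
    name, n, _ = max(themes, key=lambda t: t[1])
    return name, n
```

Note that the per-theme counts sum to 111 mentions across the 18 experts, which is consistent with individual experts raising multiple themes.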
The balance of positive and concern-oriented feedback themes is itself informative. Of the 10 identified themes, 4 were classified as positive in valence, 5 as concerns, and 1 as neutral. This distribution suggests that experts view the framework as fundamentally sound in concept (positive themes focused on the framework's theoretical and doctrinal foundations) while identifying significant practical challenges for implementation (concern themes focused on operational realities). This pattern is typical of novel military frameworks that have strong theoretical grounding but have not yet been tested in operational environments.

Figure 4.17

Mean Expert Ratings by Professional Background Category

Key Validation Findings

Three key findings emerged from the Phase 4 expert validation. First, the Dynamic Autonomy Management framework received consistently positive evaluations across all five criteria, with all mean ratings significantly above the neutral midpoint. Decision Traceability was rated highest (d = 2.97), suggesting that the framework's accountability mechanisms represent its most compelling feature from an expert perspective. Second, Scalability was rated lowest (d = 0.61), confirming the cross-phase finding that scaling from single-platform to multi-domain operations remains the framework's primary limitation. Third, qualitative feedback revealed that experts view the framework as doctrinally compatible with existing military C2 philosophy but identify significant implementation challenges related to operator training, system interoperability, and adversarial robustness.
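The statistics in Tables 4.13 and 4.14 follow standard formulas, and the analysis pattern can be sketched in a few lines of Python. The ratings and matrix passed in below are placeholders, not the study data; the sketch assumes Cohen's d was computed as (M − midpoint) / SD and that the "C" ICCs are the Shrout–Fleiss consistency forms, both of which agree with the reported values to within rounding:

```python
import numpy as np
from scipy import stats

def one_sample_vs_midpoint(ratings, midpoint=4.0):
    """One-sample t-test of ratings against a neutral scale midpoint,
    with Cohen's d computed as (M - midpoint) / SD."""
    r = np.asarray(ratings, dtype=float)
    t, p = stats.ttest_1samp(r, popmean=midpoint)
    d = (r.mean() - midpoint) / r.std(ddof=1)
    return float(t), float(p), float(d)

def icc_consistency(matrix):
    """Shrout-Fleiss consistency ICCs (the ICC(C,1) and ICC(C,k) forms
    of Table 4.14) from a two-way decomposition of an
    (n targets x k raters) rating matrix."""
    x = np.asarray(matrix, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between targets
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between raters
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    icc_c1 = (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)
    icc_ck = (ms_rows - ms_err) / ms_rows
    return icc_c1, icc_ck
```

Given the original 18 × 5 rating matrix, these two functions would reproduce the quantities reported above; the placeholder inputs merely demonstrate the computation.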
Cross-Phase Integration and Convergence

Convergence of Findings Across Phases

The sequential mixed-methods design achieved its intended purpose of methodological triangulation, with all four phases converging on the same central finding: the accountability-autonomy tradeoff constitutes the fundamental design constraint for dynamic autonomy management in human-AI command and control. This convergence across qualitative, computational, experimental, and validational methods substantially increases confidence in the robustness of this finding. Phase 1 identified Autonomy Governance as the core qualitative category, with strong co-occurrence between accountability and governance codes. Phase 2 quantified this relationship: HITL maintained 97.8% accountability chain integrity compared to HOVL's 68.2%, a 30-percentage-point degradation that represents the operational cost of increased autonomy. Phase 3 added the human factors dimension, revealing that trust declines from 5.42 (HITL) to 3.77 (HOVL) despite accuracy improvements—demonstrating that the accountability-autonomy tradeoff operates not only at the system level but also at the level of operator perception and confidence. Phase 4 experts rated Decision Traceability highest among all validation criteria (M = 5.83, d = 2.97), confirming that accountability mechanisms are the framework's most valued component.

The threat tempo effect also converged across phases. Phase 1 documents emphasized time-critical scenarios as the context in which autonomy management is most challenging. Phase 2's sensitivity analysis identified threat tempo as a moderating variable whose effects vary by architecture. Phase 3 confirmed this with the largest experimental effect size on cognitive load (η²p = .67), and the significant autonomy × tempo interaction on cognitive load (η²p = .16) revealed that the cognitive cost of human oversight is disproportionately greater under high-tempo conditions.
Phase 4 experts specifically flagged operational tempo adaptability as a concern.

Addressing Research Questions

Table 4.16 maps the cumulative evidence from all four phases to each of the three research questions, providing an integrated evidentiary foundation for the discussion in Chapter 5.

Table 4.16

Research Questions Mapped to Cross-Phase Findings

RQ1: Dynamic allocation of decision authority across operational phases
  Phase 1: Transfer-of-control triggers identified; 3 categories: threat-driven, performance-driven, governance-driven
  Phase 2: Architecture comparison shows differential performance: HITL best accountability, HOVL best speed, HOTL optimal balance
  Phase 3: Autonomy level η²p = .73 on response time; .50 on cognitive load; .37 on trust
  Phase 4: Experts rate Feasibility 5.17/7 and Doctrinal Compatibility 5.50/7

RQ2: Transfer-of-control protocols preserving MHC without degrading tempo
  Phase 1: MHC coded in 19% of documents; accountability-governance co-occurrence cluster identified
  Phase 2: HOTL achieves 2.70s latency with 86.3% accountability; best balance point
  Phase 3: Significant interaction: cognitive load × autonomy × tempo (η²p = .16); ROE compliance lowest in HOVL/High
  Phase 4: MHC Preservation rated 5.56/7; Traceability 5.83/7; experts praise transfer-of-control protocol

RQ3: C2 architecture effects on effectiveness and accountability
  Phase 1: Autonomy Governance = core category (centrality 148); technology-governance linkage dominant
  Phase 2: HITL: 97.8% accountability / 71.6% mission success; HOVL: 68.2% / 89.3%; HOTL: 86.3% / 86.3% (79.2%)
  Phase 3: Trust declines HITL (5.42) → HOVL (3.77); accuracy improves 78.7% → 85.7%
  Phase 4: Scalability rated lowest (4.72/7); experts note coordination challenges for multi-platform operations

Regarding RQ1, the cumulative evidence supports a context-dependent approach to decision authority allocation.
The qualitative analysis identified three categories of triggers for autonomy transitions, the ABM demonstrated that each architecture offers distinct advantages under specific conditions, the experimental data quantified the human performance implications of each architecture, and expert validation confirmed that the dynamic allocation approach is both feasible and doctrinally compatible. The evidence for RQ1 further indicates that the allocation logic should be sensitive to the specific operational phase. During surveillance and identification phases, where time pressure is typically lower and the consequences of individual decisions are less immediate, HITL architectures provide the highest accountability and trust. During tracking and engagement phases, where operational tempo increases and response time becomes critical, HOTL or HOVL architectures offer necessary speed advantages. Post-engagement assessment, which requires careful analysis and legal review, benefits from returning to HITL oversight. This phase-dependent allocation model is consistent with the transfer-of-control trigger categories identified in Phase 1 and validated through the performance data in Phases 2 and 3.

Regarding RQ2, the evidence converges on HOTL as the optimal default architecture for preserving meaningful human control while maintaining operational tempo. HOTL's 2.70-second response latency represents a 68.3% reduction from HITL while maintaining 86.3% accountability integrity. The Phase 3 trust data (HOTL M = 4.57) indicate that operators maintain moderate trust in HOTL operations, and the Phase 4 validation confirmed strong expert endorsement of the framework's MHC preservation mechanisms (d = 1.30).

Regarding RQ3, the evidence reveals that C2 architecture selection involves unavoidable tradeoffs. No single architecture optimizes all outcome dimensions simultaneously.
HITL maximizes accountability and trust but sacrifices speed and mission success. HOVL maximizes speed and accuracy but degrades accountability, ROE compliance, and trust. HOTL achieves the best composite balance, supporting the Dynamic Autonomy Management framework's design principle of defaulting to HOTL with dynamic transitions based on operational context.

The Dynamic Autonomy Management Framework

The integrated findings from all four phases support a Dynamic Autonomy Management (DAM) framework with four core components. First, the framework establishes HOTL as the default operating mode, based on its consistent identification as the optimal balance architecture across computational (Phase 2), experimental (Phase 3), and validational (Phase 4) evidence. Second, the framework specifies three categories of autonomy transition triggers (threat-driven, performance-driven, governance-driven) derived from Phase 1 qualitative analysis and validated through the ABM's architecture-dependent decision points. Third, the framework incorporates continuous trust calibration as a feedback mechanism, informed by the Phase 3 finding that trust declines with autonomy despite accuracy improvements. The trust calibration loop monitors operator trust levels and adjusts autonomy presentation to prevent both over-reliance (at high autonomy) and under-utilization (at low trust). Fourth, the framework enforces binding governance constraints derived from DoDD 3000.09 and IHL requirements, ensuring that autonomy transitions never violate legal or ethical boundaries regardless of operational pressures.

Unexpected Findings and Emergent Insights

Several unexpected findings emerged across the four phases. First, the trust-accuracy paradox observed in Phase 3—wherein higher autonomy produced better accuracy but lower trust—was not predicted by the ABM or qualitative analysis.
This finding suggests that trust in autonomous weapons systems is not primarily driven by performance metrics but by the perceived mechanism of control, a result with implications for system interface design. Second, the near-equivalence of HOTL's mission success rate (86.3%) and accountability integrity rate (86.3%) in the ABM was coincidental but theoretically suggestive, implying that HOTL naturally equilibrates these competing demands. Third, the Phase 4 expert validation revealed that adversarial robustness—the framework's vulnerability to cyber or electronic warfare manipulation of autonomy triggers—represents an unanticipated threat that was not addressed in the original theoretical framework but was identified by seven of 18 experts as a critical concern. Fourth, the cognitive load interaction effect (η²p = .16) revealed that dynamic autonomy transitions themselves impose cognitive costs, suggesting that the framework must minimize transition frequency and maximize transition clarity to avoid creating additional cognitive burden during high-tempo operations. This finding introduces a novel design constraint: the dynamic autonomy system must not only select the appropriate autonomy level but must also manage the transition process to minimize operator disruption.

Fifth, the divergence between Phase 3 performance data and Phase 4 expert judgment warrants attention. While Phase 3 data showed HOVL achieving the highest raw decision accuracy, Phase 4 experts expressed reservations about autonomous operation without continuous human oversight, particularly for lethal engagement decisions. This divergence reflects the distinction between statistical performance optimization and the normative requirements of military operations. In contexts where every engagement decision carries legal and moral weight, the acceptability of a C2 architecture cannot be determined solely by performance metrics.
The experts' judgment reinforces the framework's emphasis on accountability and traceability as co-equal design objectives alongside operational effectiveness.

The methodological integration assessment reveals that the sequential mixed-methods design achieved its intended complementarity. The qualitative-to-quantitative sequence (Phase 1 to Phase 2) ensured that the computational model was grounded in institutional discourse rather than arbitrary parameter selection. The computational-to-experimental sequence (Phase 2 to Phase 3) provided empirically calibrated baselines against which human performance data could be compared. The quantitative-to-validational sequence (Phases 2-3 to Phase 4) ensured that expert evaluation was informed by concrete performance data rather than abstract theoretical claims. And the convergence of all four phases on the same core finding—the accountability-autonomy tradeoff—represents the strongest possible form of methodological triangulation, where independent methods using different data sources arrive at consistent conclusions.

Chapter Summary

This chapter presented the results of a four-phase sequential mixed-methods investigation into dynamic autonomy management in human-AI command and control for autonomous weapons systems. Phase 1 qualitative analysis of 84 documents identified Autonomy Governance as the core category and revealed the accountability-autonomy tension as the dominant concern in institutional discourse. Phase 2 agent-based modeling quantified the speed-accountability tradeoff across 13,500 Monte Carlo iterations, establishing HOTL as the optimal compromise architecture. Phase 3 experimental analysis of 118 participants confirmed the ABM predictions and added critical human factors findings, including the trust-accuracy paradox and the cognitive load interaction effect.
Phase 4 expert validation by 18 defense professionals endorsed the Dynamic Autonomy Management framework across all five evaluation criteria, with Decision Traceability rated highest and Scalability identified as the primary limitation. The cross-phase integration revealed remarkable convergence: all four methodological approaches identified the accountability-autonomy tradeoff as the central design constraint, HOTL as the optimal default architecture, and threat tempo as a critical moderating variable. These convergent findings provide a robust empirical foundation for the Dynamic Autonomy Management framework discussed in Chapter 5, which synthesizes the implications of these results for theory, policy, and practice in autonomous weapons command and control. Several cross-cutting themes deserve emphasis as the reader transitions to the Discussion chapter. The trust-accuracy paradox—wherein higher-autonomy systems produce better objective performance but engender lower operator trust—represents a fundamental challenge for any system that seeks to dynamically adjust autonomy levels. If operators do not trust the system at higher autonomy levels, they may resist or undermine automatic transitions to increased autonomy precisely when such transitions would be operationally beneficial. This challenge suggests that trust calibration must be treated as a first-order design parameter rather than an emergent property of system use. The scalability challenge, consistently identified as the framework's primary limitation across Phases 2, 3, and 4, defines the most important direction for future research. The current framework was evaluated in single-platform scenarios, but real-world autonomous weapons employment will increasingly involve multi-platform, multi-domain operations where dozens of autonomous systems must be coordinated under unified command authority. 
Extending the Dynamic Autonomy Management framework to these complex operational environments will require fundamental advances in distributed autonomy management, hierarchical governance architectures, and scalable trust calibration mechanisms. Chapter 5 addresses these implications in detail, along with the theoretical, practical, and policy contributions of these findings.

CHAPTER 5: DISCUSSION

Introduction

This chapter presents the interpretation, synthesis, and implications of the empirical findings reported in Chapter 4, situating them within the broader theoretical and policy landscape established in the literature review of Chapter 2. The purpose of this dissertation was to develop and validate an empirically grounded framework for dynamic autonomy management in human-AI command and control for autonomous weapons systems. Through a four-phase sequential mixed-methods design integrating qualitative grounded theory, agent-based computational modeling, simulation-based experimentation, and tabletop exercise validation, this research addressed three research questions of direct consequence for national security and the future of military command and control.

The three research questions guiding this investigation were: (RQ1) How should decision authority be dynamically allocated between human commanders and autonomous weapons AI across different operational phases? (RQ2) What transfer-of-control protocols preserve meaningful human agency without degrading operational tempo below mission-critical thresholds? (RQ3) How do different C2 architectures—human-in-the-loop (HITL), human-on-the-loop (HOTL), and human-over-the-loop (HOVL)—affect both operational effectiveness and accountability traceability in autonomous weapons employment?
These questions were formulated to address critical gaps identified in the systematic literature review by Pokorny (2026) and the comprehensive review presented in Chapter 2, where no empirically validated framework existed for governing the dynamic allocation of decision authority between human commanders and AI systems in weapons employment contexts.

The results revealed a set of convergent findings across all four methodological phases. The fundamental tension between operational tempo and accountability integrity emerged as the central design constraint for any dynamic autonomy management framework. Human-on-the-loop architecture consistently emerged as the optimal default configuration, achieving the most balanced position on the Pareto frontier between speed and accountability. The agent-based model quantified this tradeoff precisely: HITL maintained 97.8% accountability chain integrity but with an average response latency of 8.51 seconds, while HOVL achieved response latencies of only 1.20 seconds but at the cost of accountability integrity declining to 68.2%. HOTL occupied the critical middle ground with 86.3% mission success, 86.3% accountability integrity, and a 2.70-second response latency that remained within mission-critical thresholds. The experimental phase confirmed these computational predictions with human participants, revealing that autonomy level exerted its strongest effect on response time (η²p = .73), while threat tempo dominated cognitive load effects (η²p = .67). Expert validation rated the resultant Dynamic Autonomy Management framework positively across all five criteria, with decision traceability receiving the highest rating (M = 5.83 on a 7-point scale) and scalability the lowest (M = 4.72). This chapter is organized as follows.
Section 5.2 provides a detailed interpretation of the findings organized by research question, connecting each result to the relevant theoretical frameworks from Chapter 2. Section 5.3 presents the integrated Dynamic Autonomy Management (DAM) framework that emerged from all four phases, along with its theoretical contributions and comparison with existing frameworks. Section 5.4 discusses the implications for theory, military practice, and policy. Section 5.5 addresses the limitations of the study. Section 5.6 offers recommendations for future research. Section 5.7 presents the conclusions, synthesizing the dissertation’s overall contribution to military science and national security.

Before proceeding to the interpretation of findings, it is important to note the methodological integration strategy that underpins this chapter’s analytical approach. The four-phase sequential design was specifically structured to enable the kind of multi-layered interpretation presented here. Phase 1’s qualitative grounded theory provided the conceptual vocabulary and thematic structure; Phase 2’s agent-based model provided quantitative parameterization of the qualitative constructs; Phase 3’s experimentation added the human factors dimensions that computational models cannot capture; and Phase 4’s expert validation provided the operational judgment needed to assess the framework’s real-world applicability. The interpretations that follow draw on all four phases simultaneously, weaving qualitative insight, computational data, experimental evidence, and expert judgment into a unified analytical narrative. This integration follows the mixing strategy described by Creswell and Plano Clark (2018), in which quantitative and qualitative strands are merged through narrative weaving at the interpretation stage.
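The Pareto-frontier framing of the speed-accountability tradeoff can be made concrete with the reported per-architecture summary statistics. The sketch below treats each architecture as a point (mean latency, accountability integrity) and checks dominance; the two-dimensional reduction is an intentional simplification of the full outcome space, and the dominance-check code is illustrative rather than part of the dissertation's analysis:

```python
# Architecture summary statistics as reported by the Phase 2 ABM:
# (mean response latency in seconds, accountability chain integrity in %).
ARCHITECTURES = {
    "HITL": (8.51, 97.8),
    "HOTL": (2.70, 86.3),
    "HOVL": (1.20, 68.2),
}

def dominates(a, b):
    """a dominates b if a is no slower AND no less accountable than b,
    and strictly better on at least one of the two dimensions."""
    (lat_a, acc_a), (lat_b, acc_b) = a, b
    return lat_a <= lat_b and acc_a >= acc_b and (lat_a < lat_b or acc_a > acc_b)

def pareto_frontier(archs):
    """Return (sorted) names of architectures not dominated by any other."""
    return sorted(
        name for name, point in archs.items()
        if not any(dominates(other, point)
                   for other_name, other in archs.items() if other_name != name)
    )
```

All three architectures survive the dominance check, which is precisely why architecture selection is a genuine tradeoff rather than a ranking: faster always means less accountable among the reported configurations.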
Interpretation of Findings

The interpretation of findings is organized around the three research questions, with each subsection integrating results from multiple phases and connecting them to the theoretical and empirical literature reviewed in Chapter 2. This approach follows the mixed-methods integration strategy described by Creswell and Plano Clark (2018), wherein quantitative and qualitative findings are woven together through narrative to produce interpretations that neither method alone could yield.

Research Question 1: Dynamic Autonomy Frameworks

RQ1 asked: How should decision authority be dynamically allocated between human commanders and autonomous weapons AI across different operational phases? This question addressed the most fundamental gap in the literature—the absence of any empirically validated framework for governing the dynamic allocation of decision authority in weapons employment contexts (Gap C2-3 in the research gaps analysis). The findings across all four phases converge on a context-dependent allocation model grounded in three principles: operational phase sensitivity, threat-responsive adjustment, and governance-constrained flexibility.

The principle of operational phase sensitivity emerged most clearly from the Phase 1 qualitative analysis, where documents consistently distinguished between the cognitive and governance demands of different phases of the engagement cycle. During surveillance and identification phases, where time pressure is typically lower and the consequences of individual decisions are less immediately lethal, the data support maintaining HITL authority to maximize information gathering and verification. During the tracking phase, HOTL authority enables the autonomous system to maintain target track while the human operator monitors the engagement context and prepares for potential authorization decisions.
During the engagement phase itself, the appropriate autonomy level depends critically on the engagement timeline: self-defense scenarios with sub-second decision windows may necessitate HOVL operation, while deliberate engagements with adequate decision time can maintain HITL or HOTL authority. Post-engagement assessment should revert to HITL authority to ensure thorough battle damage assessment, civilian casualty investigation, and accountability documentation.

This phase-differentiated approach represents a significant advance over the static autonomy level approaches that dominate current thinking about autonomous weapons governance. The CCW discussions on lethal autonomous weapons have focused primarily on whether systems should be permitted to operate autonomously at all, without distinguishing between the governance requirements of different operational phases. The DAM framework’s phase-sensitive approach provides a more nuanced governance architecture that can satisfy accountability requirements during phases where time permits detailed human oversight while accommodating operational tempo demands during phases where human cognitive processing time is the binding constraint.

Qualitative Foundations of Dynamic Autonomy

The Phase 1 grounded theory analysis of 84 policy, doctrinal, legal, and analytical documents identified Autonomy Governance as the core category with the highest centrality score (148.0), confirming that governance of autonomous decision authority constitutes the conceptual nucleus of institutional discourse on autonomous weapons. This finding is consistent with and extends Scharre’s (2018) observation that the central challenge of autonomous weapons is not technical capability but governance architecture.
Whereas Scharre framed the challenge primarily in terms of the political and ethical dimensions of weapons autonomy, the present study’s grounded theory reveals that practitioners and policymakers conceptualize governance as an integrative construct linking technical control mechanisms, accountability structures, and operational effectiveness requirements. The emergent theoretical framework organized around three primary relational dimensions—transfer-of-control triggers, accountability mechanisms, and governance constraints—provides a more granular structure than existing frameworks. Parasuraman, Sheridan, and Wickens’s (2000) influential model identified four information-processing stages at which automation could be applied (information acquisition, information analysis, decision and action selection, action implementation) but did not specify the conditions under which authority should transition between levels. Similarly, Sheridan and Verplank’s (1978) original taxonomy of ten levels of automation described static levels without addressing dynamic transitions. The Phase 1 findings fill this critical gap by identifying the contextual triggers that institutional actors recognize as warranting autonomy transitions: threat imminence, rules of engagement specificity, system confidence thresholds, operator cognitive state, and mission phase transitions.

The qualitative finding that transfer-of-control triggers appeared in only 7.1% of the document corpus—despite their theoretical importance—is itself significant. This sparse but theoretically crucial presence suggests that while the policy community recognizes the need for dynamic autonomy transitions, the operational protocols for executing those transitions remain underdeveloped.
This gap aligns precisely with Ekelhof’s (2019) argument for moving beyond semantic debates about definitions of autonomy toward practical analysis of how meaningful human control can be operationalized in specific weapons employment contexts. The strong co-occurrence between Accountability Chain and AI/ML Capabilities codes (Jaccard = .200, co-occurrence in 10 documents) indicates that accountability concerns arise directly in relation to specific technical capabilities rather than in the abstract. This pattern supports Cavalcante Siebert et al.’s (2023) contention that meaningful human control must be operationalized through actionable design properties tied to specific system capabilities rather than through generic governance principles. The present study extends this insight by demonstrating empirically that practitioners already think about accountability in capability-specific terms, even if existing governance frameworks do not yet reflect this granularity.

Agent-Based Modeling and the Speed-Accountability Tradeoff

The Phase 2 agent-based model translated the qualitative governance structure into a computational framework and revealed the central quantitative finding of this dissertation: the speed-accountability tradeoff. Across 13,500 Monte Carlo iterations, the ABM demonstrated that moving from HITL to HOVL reduces response latency by 85.9% (from 8.51 seconds to 1.20 seconds) while simultaneously reducing accountability chain integrity by 29.6 percentage points (from 97.8% to 68.2%). This tradeoff is not merely an empirical observation—it represents a fundamental constraint on dynamic autonomy system design that has profound implications for how military commanders allocate decision authority. The significance of this finding becomes apparent when situated within the theoretical frameworks of Chapter 2.
Boyd’s (1996) OODA loop theory emphasizes that the combatant who can execute the observe-orient-decide-act cycle faster than the adversary gains a decisive advantage. In this framework, the response latency reductions afforded by higher autonomy levels directly enhance C2 tempo. HOVL’s 1.20-second response latency enables engagement timelines that would be impossible under HITL’s 8.51-second average, particularly against time-sensitive targets or in contested environments where engagement windows may be measured in seconds. However, Brehmer’s (2005) dynamic OODA loop refinement emphasizes that speed without feedback degrades decision quality over iterative engagements—a prediction borne out by HOVL’s declining ROE adherence (82.0% vs. HITL’s 91.0%). The ABM’s finding that HOTL achieves 86.3% mission success rate and 86.3% accountability integrity—values that are coincidentally equal—is theoretically suggestive. This convergence implies that HOTL naturally equilibrates the competing demands of operational effectiveness and governance compliance, supporting the interpretation that HOTL represents a structurally balanced architecture rather than merely a convenient intermediate option. The 2.70-second mean response latency under HOTL is fast enough to remain within the engagement decision windows identified in publicly available military doctrine and wargaming materials, while the 86.3% accountability rate maintains the traceability required for post-engagement review and legal compliance under international humanitarian law (Schmitt, 2013; Boothby, 2014). The sensitivity analysis further refines the implications for dynamic autonomy design. System accuracy emerged as the most influential parameter, with a 9.7-percentage-point swing in mission success between low (0.80) and high (0.96) accuracy values.
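The form of analysis described here—varying one parameter between its low and high values while holding the others at baseline—can be expressed generically. The sketch below is a hypothetical one-at-a-time (OAT) harness with an invented toy response surface; it is not the dissertation’s ABM, and the parameter names and weights are illustrative assumptions only:

```python
from typing import Callable, Dict

def oat_swing(model: Callable[[Dict[str, float]], float],
              baseline: Dict[str, float],
              param: str, low: float, high: float) -> float:
    """One-at-a-time sensitivity: output swing when `param` moves from
    `low` to `high` while all other parameters stay at baseline."""
    return model({**baseline, param: high}) - model({**baseline, param: low})

# Hypothetical toy response surface (NOT the dissertation's ABM):
# mission success as a weighted function of accuracy and decision time.
def toy_mission_success(cfg: Dict[str, float]) -> float:
    return 100 * min(1.0, 0.6 * cfg["system_accuracy"]
                     + 0.4 * (1 - cfg["human_decision_time_s"] / 20))

baseline = {"system_accuracy": 0.88, "human_decision_time_s": 8.0}
swing = oat_swing(toy_mission_success, baseline, "system_accuracy", 0.80, 0.96)
print(f"Mission-success swing for accuracy 0.80 -> 0.96: {swing:.1f} pp")
```

The same harness applied to each model parameter in turn produces the ranking of parameter influence that the sensitivity analysis reports.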
This finding resonates with Endsley’s (2017) emphasis on system reliability as the foundation of appropriate trust and effective human-automation collaboration. The second most influential parameter was human decision time, reinforcing the centrality of the human cognitive bottleneck identified by Cummings (2017) as the fundamental constraint on human oversight of autonomous systems. Together, these sensitivity results indicate that investments in AI system accuracy and human-computer interface design will yield greater returns for dynamic autonomy management than adjustments to the autonomy allocation logic itself—a finding with direct implications for acquisition priorities.

Dynamic Autonomy and Existing Governance Frameworks

The findings on dynamic autonomy allocation extend and, in important respects, challenge several existing frameworks. Santoni de Sio and van den Hoven’s (2018) meaningful human control framework requires that autonomous systems satisfy two conditions: tracking (the system’s behavior responds to the human operator’s moral reasons) and tracing (responsibility can be attributed to the human operator). The present study’s findings demonstrate that these conditions are differentially satisfied across C2 architectures. HITL satisfies both conditions robustly (97.8% accountability integrity), but at the cost of operational tempo that may itself compromise mission effectiveness and, paradoxically, the protection of civilian lives. HOVL partially satisfies tracing (68.2% accountability integrity) but raises serious concerns about tracking, as the human operator’s distance from individual engagement decisions attenuates the link between moral reasoning and system behavior. This finding complicates the meaningful human control framework by demonstrating empirically that the two conditions—tracking and tracing—exist in tension with each other under realistic operational conditions.
HOTL’s 86.3% accountability integrity suggests that a supervisory model can substantially satisfy the tracing condition while preserving tracking through the operator’s ability to intervene. However, the Phase 3 trust data reveal a further complication: operators under HOVL reported significantly lower trust (M = 3.77) than under HITL (M = 5.42) despite HOVL’s superior decision accuracy (85.69% vs. 78.68%), suggesting that the subjective sense of meaningful control degrades faster than objective accountability measures would predict. Scharre’s (2018, 2023) analysis of autonomous weapons governance emphasized the spectrum of human control options and the need to calibrate the degree of human involvement to the specific operational context. The present study provides the first empirical parameterization of this calibration. The ABM and experimental data together suggest that the governance framework should specify not merely the available autonomy levels but the contextual triggers for transitioning between them. DoD Directive 3000.09’s (U.S. Department of Defense, 2023) governance parameters, which appeared in 17.9% of Phase 1 documents and constrained ABM behavior effectively, provide the regulatory scaffolding for such transitions. However, the Directive’s binary classification of autonomous and semi-autonomous systems does not accommodate the continuous, context-dependent transitions that the present study demonstrates are operationally necessary. This gap between policy categories and operational reality represents a significant finding for Directive revision. The alignment of the DAM framework with DoDD 3000.09 governance parameters merits further analysis.
The Directive establishes requirements for senior-level review and approval of autonomous weapons systems, specifying that systems capable of selecting and engaging targets without further human input must undergo a senior review process involving the Under Secretary of Defense for Policy, the Under Secretary of Defense for Research and Engineering, and the Chairman of the Joint Chiefs of Staff. The DAM framework’s governance constraint architecture operationalizes these requirements by embedding them as hierarchical constraints that bound the autonomous system’s operating parameters. Critically, the framework’s dynamic transitions between autonomy levels do not circumvent the Directive’s approval requirements; rather, they function within the boundaries of a pre-approved autonomy envelope that the senior review process has already sanctioned. This distinction—between dynamic transitions within an approved autonomy envelope and unauthorized expansions of autonomous authority—is essential for the framework’s legal and policy viability. The DAM framework does not propose that tactical operators unilaterally expand the scope of autonomous weapons employment. Instead, it proposes that the senior review process approve a defined range of operating modes (HITL, HOTL, and HOVL) with specified transition conditions, and that tactical operators manage transitions within that pre-approved range. This approach preserves the Directive’s senior oversight requirements while enabling the operational flexibility that the speed-accountability tradeoff data demonstrate is necessary. The framework’s alignment with the Directive is further supported by the Phase 1 finding that DoDD 3000.09 governance appeared in 17.9% of the document corpus, consistently framed as the authoritative governance instrument rather than as one among several competing frameworks.
This institutional consensus on the Directive’s primacy suggests that any dynamic autonomy framework must be designed to operate within—not parallel to or in tension with—the existing policy architecture. The DAM framework’s hierarchical governance constraint structure, with DoDD 3000.09 parameters occupying the second tier (below international humanitarian law but above theater-specific rules of engagement), reflects this institutional reality.

Research Question 2: Human-AI Trust and Decision Quality

RQ2 asked: What transfer-of-control protocols preserve meaningful human agency without degrading operational tempo below mission-critical thresholds? Whereas RQ1 addressed the allocation of authority, RQ2 focused on the mechanisms by which authority transitions occur and the human factors that mediate their effectiveness. The findings reveal a complex interaction between trust calibration, cognitive load, and operational tempo that has profound implications for transfer-of-control protocol design.

Trust Calibration Across Autonomy Levels

The Phase 3 experimental findings on trust constitute one of the most theoretically significant results of this dissertation. Trust scores declined monotonically with increasing autonomy: HITL (M = 5.42, SD = 0.80), HOTL (M = 4.57, SD = 0.86), HOVL (M = 3.77, SD = 1.11). This decline occurred despite HOVL’s objectively superior decision accuracy (M = 85.69%) compared to HITL (M = 78.68%). The main effect of autonomy level on trust was large (η²p = .37), confirming that the degree of autonomy fundamentally shapes operator trust independently of system performance. This trust-accuracy paradox—wherein higher autonomy produces better accuracy but lower trust—has not been previously documented in the military human-AI teaming literature and represents a novel contribution to trust theory. The finding is partially consistent with, but extends, several theoretical frameworks from Chapter 2.
Lee and See’s (2004) foundational trust framework identified three bases of trust in automation: performance (the system’s competence), process (the system’s algorithms), and purpose (the system’s intent). The trust-accuracy paradox suggests that operators under HOVL conditions may perceive adequate performance-based trust but experience deficits in process-based trust (they cannot observe or understand the system’s decision process) and purpose-based trust (they question whether the system’s optimization criteria align with human values in combat contexts). This interpretation suggests the need for a fourth dimension of trust in autonomous weapons contexts: agency-based trust. Agency-based trust reflects the operator’s confidence not in the system’s capability (performance), methods (process), or goals (purpose), but in their own ability to influence the system’s behavior when needed. Under HITL, agency-based trust is high because the operator authorizes every action. Under HOTL, it is moderate because the operator retains override capability but does not initiate actions. Under HOVL, it is low because the operator’s influence is limited to pre-set parameters with limited real-time intervention capability. The monotonic decline in trust across autonomy levels, even as accuracy improves, is most parsimoniously explained by declining agency-based trust rather than deficits in performance, process, or purpose trust. If agency-based trust is indeed the mechanism driving the trust-accuracy paradox, it has direct implications for autonomous weapons system design. Systems operating in HOVL mode should be designed to maximize the operator’s sense of agency even when the system is making autonomous decisions.
This could be achieved through configurable decision boundaries that the operator sets in real time, real-time veto capability with minimal latency, and post-decision explanation and review capabilities that enable the operator to assess and, if necessary, correct the system’s autonomous actions. The Okamura and Yamada (2020) adaptive trust calibration framework provides a starting point for this design approach, but their work in civilian contexts must be substantially adapted for the lethal-force domain where the stakes of miscalibrated trust are orders of magnitude higher. Hoff and Bashir’s (2015) three-layer model of trust in automation distinguishes dispositional trust (stable individual differences), situational trust (context-dependent), and learned trust (developed through interaction). The present study’s finding that trust declined with autonomy regardless of threat tempo (the autonomy × tempo interaction was nonsignificant, F(4, 109) = 1.321, p = .267) suggests that the trust decrement associated with high autonomy operates primarily through the dispositional and learned trust layers rather than through situational factors. This implies that trust calibration interventions for high-autonomy systems must address deep-seated operator attitudes toward autonomous decision-making, not merely contextual factors—a substantially more challenging design requirement than existing trust engineering approaches typically assume. De Visser et al.’s (2018) work on trust repair in human-machine interaction emphasized the importance of the transition from automation to autonomy in shaping trust dynamics. The present study’s data support and extend this analysis by demonstrating that the trust challenge is not simply one of repair after failure but of calibration across operating modes.
Operators experienced a persistent trust deficit under high-autonomy conditions even in the absence of system failures, suggesting that the very structure of HOVL operation—reduced operator involvement in decision processes—constitutes an ongoing trust-eroding factor that cannot be addressed solely through trust repair mechanisms. The practical implications of this finding for weapons system design are substantial. Current trust repair approaches in the automation literature—such as apologies, explanations, and performance demonstrations (de Visser et al., 2018)—assume that trust deficits arise from specific system failures that can be remediated. The present study reveals a structural trust deficit that is inherent to the HOVL operating mode itself, independent of system performance. Addressing this deficit requires not trust repair but trust architecture redesign: building transparency and agency-preserving mechanisms into the system’s fundamental operating architecture so that operators maintain a sense of meaningful participation even when the system is executing decisions autonomously. Madhavan and Wiegmann’s (2007) integrative review of similarities and differences between human-human and human-automation trust identified a critical distinction: trust in humans is built primarily through reciprocal interaction, whereas trust in machines is built primarily through observation of reliable performance. The present study’s data suggest that military autonomous weapons contexts introduce a third pathway: trust through governance assurance. Operators may trust a high-autonomy system not because they interact with it reciprocally or observe its performance directly, but because they have confidence in the governance constraints that bound its behavior. 
The DAM framework’s governance transparency channel is designed to leverage this pathway, providing operators with continuous assurance that the system is operating within approved parameters even when they are not directly involved in individual decisions. Schaefer et al.’s (2016) meta-analysis of factors influencing trust development in automation, conducted with specific attention to military systems, identified three categories of factors: human-related (age, experience, propensity to trust), automation-related (reliability, transparency, behavior), and environmental (task complexity, workload, risk). The present study’s findings align with this taxonomy but suggest that the relative weighting of these factors shifts dramatically in weapons employment contexts. Environmental factors, particularly threat tempo, dominated cognitive load and accuracy effects but did not significantly affect trust. Automation-related factors, specifically the degree of operator involvement (autonomy level), dominated trust effects. This suggests that in weapons employment contexts, the structural relationship between operator and system matters more for trust than either the operator’s individual characteristics or the environmental conditions—a finding that prioritizes system architecture design over operator selection or environmental management as the primary lever for trust optimization.

The Trust-Accuracy Paradox Under High-Tempo Conditions

The interaction between autonomy level and threat tempo on cognitive load (η²p = .16) reveals a particularly consequential dynamic for transfer-of-control protocol design. Under high-threat tempo conditions, HITL operators experienced dramatically elevated cognitive load (M = 71.47) compared to HOVL operators (M = 36.12)—a difference that translates directly into degraded oversight capability.
This finding provides direct empirical support for Bainbridge’s (1983) ironies of automation: the conditions under which human oversight is most needed (high-tempo, high-stakes engagements) are precisely the conditions under which human cognitive resources are most constrained. The implications for Endsley’s (1995; 2017) out-of-the-loop (OOTL) performance problem are equally significant. Endsley demonstrated that operators removed from active control loops experience degraded situation awareness and slower intervention response times. The present study quantifies this phenomenon in the weapons employment context: HITL operators, despite maintaining high trust and accountability, showed response times averaging 10.76 seconds—fast enough for some engagement scenarios but potentially too slow for time-critical targets or defensive engagements against hypersonic threats. Conversely, HOVL operators achieved 1.75-second response times but at the cost of reduced ROE compliance (85.15%) and substantially reduced trust, suggesting that these operators may have been functionally “out of the loop” despite nominal supervisory authority. Cummings’s (2017) analysis of AI and the future of warfare identified the cognitive bottleneck of human oversight as the fundamental constraint on autonomous weapons employment. The present study’s data precisely quantify this bottleneck. The 6.32-second difference in response time between HITL and HOTL, and the 9.01-second difference between HITL and HOVL, represent the temporal cost of human cognitive processing in the engagement decision cycle. Under high-tempo conditions, where engagement windows may collapse to single-digit seconds, this cognitive bottleneck becomes mission-critical. The finding that threat tempo exerted the largest effect on cognitive load (η²p = .67) confirms that the bottleneck is tempo-dependent, intensifying precisely when operational demands are greatest.
These findings collectively inform the design of transfer-of-control protocols that preserve meaningful human agency while accommodating the realities of modern combat tempo. The protocols must be sensitive to three interdependent parameters: the current threat tempo (which determines available decision time), the operator’s cognitive load (which determines oversight quality), and the trust calibration state (which determines the operator’s willingness to delegate or reclaim authority). A protocol that triggers a transition to higher autonomy when cognitive load exceeds a threshold, while simultaneously providing trust-maintaining transparency measures, offers the most promising approach to resolving the competing demands identified by the data.

Cognitive Load and the Dynamics of Authority Transfer

The significant interaction between autonomy level and threat tempo on cognitive load (F(4, 109) = 5.31, p < .001, η²p = .16) reveals that the cognitive cost of human oversight is nonlinear across operating conditions. Under low-tempo conditions, the cognitive load difference between HITL (M = 35.41) and HOVL (M = 17.23) was modest. Under high-tempo conditions, however, the gap widened dramatically: HITL cognitive load surged to 71.47 while HOVL remained at 36.12. This nonlinearity has critical implications for transfer-of-control protocol design: it indicates that the optimal point for transitioning from lower to higher autonomy shifts as threat tempo increases, and that the cost of maintaining human-intensive oversight escalates disproportionately under combat conditions. Kaber and Endsley’s (2004) experimental research on adaptive automation demonstrated that dynamically adjusting automation levels based on task demands could mitigate the out-of-the-loop problem while preserving situation awareness.
The present study’s findings are consistent with this approach but add a critical nuance: the transition itself imposes cognitive costs. If dynamic autonomy transitions require operators to shift mental models, update situational awareness displays, and recalibrate trust expectations, then frequent transitions may paradoxically increase rather than decrease cognitive load. This suggests that the DAM framework should implement “sticky” autonomy levels that resist rapid oscillation, transitioning only when conditions clearly warrant a change and remaining at the new level for a sufficient duration to allow cognitive stabilization. The ROE compliance data reinforce this interpretation. ROE compliance declined monotonically with both increasing autonomy (HITL: 91.68%, HOTL: 86.76%, HOVL: 85.15%) and increasing threat tempo (low: 92.35%, medium: 88.41%, high: 82.81%). The absence of a significant autonomy × tempo interaction for ROE compliance (F(4, 109) = 0.851, p = .496) suggests that the decline in legal compliance operates through independent pathways for each factor: higher autonomy reduces compliance through reduced human verification, while higher tempo reduces compliance through degraded cognitive processing of engagement criteria. Transfer-of-control protocols must therefore address both pathways simultaneously—maintaining governance verification steps while minimizing the cognitive load those steps impose.

Implications for Transfer-of-Control Protocol Design

The convergence of trust, cognitive load, and performance findings points toward a transfer-of-control protocol with several essential features.
First, transitions between autonomy levels must be triggered by objectively measured conditions (threat tempo, cognitive load indicators, engagement timeline constraints) rather than solely by operator request, given the finding that cognitive load impairs the very judgment needed to assess whether oversight delegation is appropriate. Second, each transition must be accompanied by calibrated transparency measures—sufficient to maintain process-based trust (Lee & See, 2004) without overloading already-stressed operators with excessive information. Third, downward transitions (from higher to lower autonomy) must be prioritized over upward transitions when accountability-critical conditions are detected, even at the cost of response latency, to preserve the traceability required for post-engagement legal review. The Phase 4 expert validation data support this interpretation. The Trust Calibration Mechanism theme was cited by 11 of 18 experts as the framework’s most innovative feature, with one expert noting that it “creates a mechanism to prevent both over-reliance and underutilization.” However, the Operational Tempo Adaptability theme, cited by 9 experts as a concern, highlighted the tension between protocol completeness and operational speed. These expert assessments confirm that transfer-of-control protocols must be parsimonious—minimal steps, maximum information density—to function within the temporal constraints of modern combat.

Research Question 3: Operational Validation and Implementation

RQ3 asked: How do different C2 architectures affect both operational effectiveness and accountability traceability in autonomous weapons employment? This question addressed the comparative assessment gap (C2-5) identified in the research gaps analysis—the absence of empirical comparison of C2 architectures with measurable outcomes for both operational effectiveness and accountability.
The findings provide the first such comparison, with results from both computational simulation and human experimental evaluation, validated by expert judgment.

Expert Validation and Operational Feasibility

The Phase 4 tabletop exercise engaged 18 defense professionals representing six categories of expertise: senior military officers, DoD civilians, defense industry engineers, think tank researchers, academic professors, and congressional staff. All five validation criteria received mean ratings significantly above the neutral midpoint (p < .05), indicating overall positive expert evaluation of the DAM framework’s operational viability. Decision traceability received the highest mean rating (M = 5.83, SD = 0.62, d = 2.97), indicating near-consensus among experts that the framework adequately addresses the accountability challenge. This strong validation of traceability is particularly significant given that accountability chain integrity was identified as the primary governance constraint across all four phases. The finding suggests that the framework’s explicit transfer-of-control protocol, which requires documented authorization or acknowledgment at each autonomy transition, successfully operationalizes the “tracing” condition of Santoni de Sio and van den Hoven’s (2018) meaningful human control framework. That 15 of 18 experts cited Accountability Clarity as a positive theme confirms that the traceability mechanism addresses a recognized gap in current autonomous weapons governance. Doctrinal compatibility received the second-highest rating (M = 5.50, SD = 0.79, d = 1.91), with experts noting the framework’s alignment with existing mission command philosophy. This finding resonates with the U.S. Army’s (2019) ADP 6-0 emphasis on disciplined initiative within commander’s intent.
The DAM framework’s approach—establishing governance parameters (analogous to commander’s intent) within which the autonomous system exercises delegated authority (analogous to subordinate initiative)—maps naturally onto the mission command construct. Twelve of 18 experts specifically noted this alignment, with one observing that “the framework maps well to existing mission command philosophy, particularly the concept of disciplined initiative within commander’s intent.” This doctrinal compatibility substantially enhances the framework’s prospects for institutional adoption. Meaningful human control preservation received a strong but more variable rating (M = 5.56, SD = 1.20, d = 1.30). The higher standard deviation reflects divergent expert views on whether the framework’s supervisory control model constitutes “meaningful” control in the philosophically rigorous sense articulated by the international governance community. Some experts, particularly those from legal and ethical backgrounds, expressed concern that HOVL operations, even with the framework’s governance constraints, may not satisfy the most stringent interpretations of meaningful human control advanced in the Convention on Certain Conventional Weapons negotiations. This divergence highlights a fundamental tension between the operational community’s pragmatic interpretation of “sufficient” human control and the legal-ethical community’s more demanding standard—a tension that the framework acknowledges but cannot fully resolve through technical means alone.

Scalability: The Primary Implementation Challenge

Scalability received the lowest mean rating (M = 4.72, SD = 1.18, d = 0.61), though still significantly above the neutral midpoint.
This finding was consistent across all four phases: the ABM was designed for single-platform engagement scenarios, the experimental design tested individual operator-system dyads, and 13 of 18 experts flagged scalability as a concern. The Scalability Concerns qualitative theme captured expert reservations about extending the framework from single-platform engagement to the multi-domain, multi-system operations envisioned under Joint All-Domain Command and Control (JADC2). This scalability challenge connects directly to Alberts and Hayes’s (2003, 2006) work on C2 agility and the NATO STO’s (2014) C2 agility framework. Alberts argued that effective C2 must be able to transition between different C2 approaches as conditions change—precisely the capability the DAM framework seeks to provide. However, when scaled to multi-system operations involving dozens of autonomous platforms across multiple domains, the governance overhead of tracking individual platform autonomy states, coordinating transitions, and maintaining aggregate accountability chains may exceed the cognitive and computational capacities of current C2 systems. The JADC2 architecture envisions AI systems performing sensor fusion across multiple domains, automated target tracking, and course-of-action generation (Lingel et al., 2020)—capabilities that presuppose scalable autonomy management mechanisms that do not yet exist. The scalability limitation suggests a natural boundary condition for the current DAM framework: it is optimized for tactical-level, single-platform or small-unit engagement decisions and requires substantial extension before it can govern operational or strategic-level multi-domain autonomous operations. This boundary is not a deficiency of the framework per se but rather reflects the current state of the art in both autonomous systems technology and C2 architecture design.
The scalability challenge defines the most important direction for future research.

Connecting to Boyd’s OODA Loop and JADC2

The architecture comparison results have direct implications for how dynamic autonomy management integrates with established C2 theory. Boyd’s (1996) OODA loop remains the dominant conceptual model for military C2, and the response latency differences across architectures map directly onto the “Decide” and “Act” phases of the loop. HITL’s 10.76-second experimental response time (Phase 3) represents a complete human decision cycle within the loop, while HOVL’s 1.75-second response compresses the decide-act phases to near-machine speed. HOTL’s 4.44-second response maintains human involvement in the loop while achieving sufficient tempo for most engagement scenarios. Brehmer’s (2005) dynamic OODA loop refinement, which incorporated feedback mechanisms between OODA iterations, provides additional theoretical grounding for the DAM framework’s design. The trust calibration mechanism built into the framework functions as an explicit feedback loop: after each engagement cycle, the system updates its trust state based on operator actions, system performance, and outcome assessment. This feedback enables the autonomy allocation to adapt over successive engagement cycles, implementing the kind of dynamic C2 adaptation that Brehmer theorized but did not operationalize for autonomous systems contexts. The JADC2 implications extend beyond response latency to the coordination challenges of multi-domain operations. The DAM framework’s governance parameters—which define the boundaries of autonomous action through rules of engagement, escalation thresholds, and accountability requirements—provide a template for how autonomous systems should be governed within the JADC2 architecture.
The framework’s approach of defining permissible autonomy ranges rather than prescribing specific autonomy levels aligns with JADC2’s emphasis on agile, context-responsive C2. However, the scalability challenge identified in the validation phase underscores the substantial engineering work required to translate the single-platform framework into a multi-domain governance architecture.

Expert Qualitative Feedback in Context

The thematic analysis of expert qualitative feedback yielded 10 themes, of which 4 were positive, 5 were concerns, and 1 was neutral. This balance of positive and critical assessment is itself an important finding. A framework that elicited only positive feedback would raise concerns about evaluation rigor; conversely, predominantly negative feedback would call its viability into question. The 4:5:1 ratio suggests a framework that is fundamentally sound but requires refinement in specific dimensions—precisely the outcome expected for a novel framework at the validation stage. The most frequently cited theme, Accountability Clarity (n = 15), validates the framework’s central design decision to make accountability chain integrity the primary governance constraint. Implementation Complexity (n = 14) and Scalability Concerns (n = 13) identify the practical challenges of operationalizing the framework, while Training Requirements (n = 12, neutral valence) highlights the human capital investment needed for effective deployment. These themes collectively paint a picture of a framework whose theoretical architecture is validated but whose implementation pathway requires careful institutional planning. The Adversarial Robustness theme (n = 7), while least frequent, raises a concern of particular strategic significance.
If adversaries can manipulate the triggers that drive autonomy transitions—through cyber attacks on sensor systems, electronic warfare against communication links, or deliberate creation of conditions designed to trigger inappropriate autonomy shifts—the framework’s governance mechanisms could be subverted. This vulnerability connects to the counter-autonomy literature reviewed in Chapter 2, particularly Garcia’s (2018) analysis of how lethal AI could destabilize international security and the DARPA Assured Autonomy program’s (2019) efforts to develop verification methods for autonomous systems operating in adversarial environments. Addressing adversarial robustness must be a priority for framework refinement.

The Dynamic Autonomy Management (DAM) Framework

The preceding interpretation of findings across all four phases supports the synthesis of an integrated Dynamic Autonomy Management framework. The DAM framework represents the primary scholarly contribution of this dissertation—an empirically grounded, operationally validated governance architecture for managing the dynamic allocation of decision authority between human commanders and autonomous weapons systems across the spectrum of military operations.

Framework Overview and Architecture

The DAM framework comprises five interconnected components, each grounded in specific empirical findings and theoretical foundations identified through the four-phase research design. These components are: (a) a three-tier autonomy spectrum with defined operating modes, (b) a context-sensitive transfer-of-control trigger system, (c) an accountability chain maintenance protocol, (d) a continuous trust calibration mechanism, and (e) a governance constraint architecture aligned with DoDD 3000.09 and international humanitarian law requirements.
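As a concrete illustration of component (a), the three-tier spectrum and its default transitions toward more or less human oversight can be sketched in a few lines of Python. The type and function names here are invented for the sketch, and the one-step escalation/delegation rule is a deliberate simplification of the framework’s transition protocols, not an implementation of them.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """Ordered from maximum human oversight to maximum machine authority."""
    HITL = 1  # human-in-the-loop: every engagement individually authorized
    HOTL = 2  # human-on-the-loop: supervisory control with override (default)
    HOVL = 3  # human-over-the-loop: pre-authorized autonomous execution

# Convergent finding across all four phases: HOTL is the default operating mode.
DEFAULT_LEVEL = AutonomyLevel.HOTL

def escalate(level: AutonomyLevel) -> AutonomyLevel:
    """Shift one step toward more human oversight (e.g., ROE ambiguity,
    high collateral damage risk). Saturates at HITL."""
    return AutonomyLevel(max(level - 1, AutonomyLevel.HITL))

def delegate(level: AutonomyLevel) -> AutonomyLevel:
    """Shift one step toward more machine authority (e.g., saturation attack,
    degraded communications). Saturates at HOVL."""
    return AutonomyLevel(min(level + 1, AutonomyLevel.HOVL))
```

From the HOTL default, `escalate` yields HITL and `delegate` yields HOVL, matching the framework’s two defined transition modes.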
The framework’s first component establishes HOTL as the default operating mode, based on the convergent finding across all four phases that HOTL achieves the most balanced position on the speed-accountability Pareto frontier. The default HOTL configuration grants the autonomous system authority to execute pre-authorized engagement protocols while maintaining the human operator in a supervisory role with the ability to override, modify, or abort system actions. The framework defines two transition modes: escalation to HITL when conditions require enhanced human oversight (complex targeting situations, high collateral damage risk, rules of engagement ambiguity) and delegation to HOVL when conditions demand maximum response speed (time-critical defensive engagements, saturation attack scenarios, degraded communication environments).

The second component specifies the transfer-of-control triggers identified through Phase 1’s grounded theory analysis and validated through Phases 2–4. Triggers are organized into three categories: condition-based triggers (threat tempo exceeding defined thresholds, system confidence falling below acceptable limits), event-based triggers (mission phase transitions, rules of engagement changes, communication degradation), and operator-initiated triggers (commander judgment that conditions warrant autonomy adjustment). Each trigger is associated with a defined transition protocol specifying the verification steps, documentation requirements, and authority confirmations needed to execute the transition.

The third component addresses accountability chain maintenance, which the cross-phase analysis identified as the primary governance constraint. The protocol requires that every autonomy transition generate a timestamped, attributable record linking the transition to a specific authorization or acknowledgment. Under HITL, every engagement decision is individually authorized.
Under HOTL, the operator’s acknowledgment of the system’s action recommendation creates the accountability link. Under HOVL, the commander’s pre-authorization of the engagement parameters, combined with automated logging of system decisions against those parameters, maintains the audit trail. Phase 2’s finding that HOTL achieves 86.3% accountability integrity, together with Phase 4 experts’ traceability rating of 5.83/7, validates this approach as operationally feasible.

The fourth component implements continuous trust calibration, motivated by the Phase 3 finding that trust declines with autonomy despite accuracy improvements. The trust calibration mechanism operates through three channels: performance transparency (real-time display of system confidence, decision rationale, and outcome feedback), process transparency (simplified explanation of the system’s decision logic appropriate to the operator’s cognitive state), and governance transparency (continuous indication of the current autonomy level, active governance constraints, and override capability status). This multi-channel approach draws on Lee and See’s (2004) three bases of trust and Hoff and Bashir’s (2015) layered trust model to address the specific trust deficits identified in the experimental data.

The fifth component establishes the governance constraint architecture that defines the outer boundaries of permissible autonomous action. These constraints are hierarchically organized: international humanitarian law requirements (distinction, proportionality, precaution) form the inviolable outer boundary; DoDD 3000.09 parameters define the policy framework; theater and mission-specific rules of engagement provide operational constraints; and commander’s intent specifies the tactical parameters within which the autonomous system operates.
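This layered ordering can be sketched as a fail-closed evaluation in which outer layers are checked first and no inner layer can relax an outer layer’s verdict. The layer names follow the framework’s hierarchy; the dictionary interface and the fail-closed handling of missing verdicts are illustrative assumptions, not part of the dissertation.

```python
# Outermost first: an engagement is permitted only if every layer passes.
GOVERNANCE_HIERARCHY = [
    "ihl",                # distinction, proportionality, precaution (inviolable)
    "dodd_3000_09",       # DoD policy parameters
    "roe",                # theater and mission-specific rules of engagement
    "commanders_intent",  # tactical parameters
]

def engagement_permitted(checks):
    """Evaluate layers outermost-first.

    `checks` maps each layer name to whether the proposed action satisfies it.
    Returns (permitted, first_failing_layer). A missing verdict fails closed.
    Operational tempo is deliberately absent from the inputs: no tempo value
    can override a governance layer.
    """
    for layer in GOVERNANCE_HIERARCHY:
        if not checks.get(layer, False):
            return False, layer
    return True, None
```

Because evaluation is outermost-first, an IHL violation is reported even when every inner layer would pass, mirroring the inviolable outer boundary described above.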
This hierarchical structure ensures that governance constraints are never overridden by operational tempo demands—a design principle directly responsive to the expert concern about maintaining legal compliance under high-tempo conditions.

Figure 5.1

The Dynamic Autonomy Management (DAM) Framework Architecture

1. Three-Tier Autonomy Spectrum (HITL/HOTL/HOVL)
   Description: Default HOTL with context-dependent transitions to HITL or HOVL.
   Empirical basis: Phase 2: HOTL = 86.3% mission success + 86.3% accountability; Phase 3: best composite performance.
   Theoretical foundation: Parasuraman et al. (2000); Sheridan & Verplank (1978); Endsley (2017).

2. Transfer-of-Control Trigger System
   Description: Condition-based, event-based, and operator-initiated triggers with defined protocols.
   Empirical basis: Phase 1: grounded theory identified 3 trigger categories; Phase 4: expert validation (M = 5.17 feasibility).
   Theoretical foundation: Dorais et al. (1999); Goodrich et al. (2001); Feigh et al. (2012).

3. Accountability Chain Maintenance Protocol
   Description: Timestamped, attributable records at every autonomy transition; architecture-specific verification.
   Empirical basis: Phase 2: HITL 97.8% vs. HOVL 68.2% accountability; Phase 4: traceability rated highest (M = 5.83).
   Theoretical foundation: Santoni de Sio & van den Hoven (2018); Sparrow (2007); Matthias (2004).

4. Continuous Trust Calibration Mechanism
   Description: Three-channel transparency: performance, process, and governance; adaptive to operator cognitive state.
   Empirical basis: Phase 3: trust-accuracy paradox (HOVL: 85.7% accuracy, 3.77 trust); Phase 4: 11 experts cited as innovative.
   Theoretical foundation: Lee & See (2004); Hoff & Bashir (2015); de Visser et al. (2018).

5. Governance Constraint Architecture
   Description: Hierarchical constraints: IHL > DoDD 3000.09 > ROE > commander’s intent; never overridden by tempo.
   Empirical basis: Phase 1: DoDD 3000.09 in 17.9% of documents; Phase 2: ROE adherence 82.0–91.0% across architectures.
   Theoretical foundation: Schmitt (2013); Boothby (2014); DoD Directive 3000.09 (2023).

Note.
HITL = Human-in-the-Loop; HOTL = Human-on-the-Loop; HOVL = Human-over-the-Loop; IHL = International Humanitarian Law; ROE = Rules of Engagement; MHC = Meaningful Human Control.

Theoretical Contributions

The DAM framework makes four distinct theoretical contributions to the scholarly literature on human-AI teaming, dynamic autonomy, and military command and control.

First, the framework advances human-AI teaming theory beyond the current state of the art represented by McNeese, Demir, and Cooke (2017), Johnson (2025), and O’Neill et al. (2022). These scholars established that effective human-AI teaming requires shared mental models, trust calibration, and communication protocols. The DAM framework extends this work by specifying how these abstract requirements translate into concrete governance mechanisms for weapons employment. The trust calibration component, grounded in the empirically documented trust-accuracy paradox, demonstrates that trust in human-AI military teams is not simply a function of system reliability but is mediated by the operator’s perceived agency—a finding that enriches the trust construct as applied to autonomous weapons contexts.

Second, the framework extends dynamic autonomy theory beyond the foundational work of Dorais et al. (1999), Goodrich et al. (2001), and Miller and Parasuraman (2007). These researchers established the concept of adjustable autonomy and demonstrated its benefits in space operations and telerobotics. The DAM framework translates adjustable autonomy from these benign contexts to the lethal-force domain, adding governance constraints and accountability requirements that are not necessary in non-weapons applications.
The framework’s transfer-of-control trigger taxonomy—condition-based, event-based, and operator-initiated—provides a more granular specification of transition mechanisms than any existing framework, grounded in the empirical analysis of 84 policy documents rather than theoretical speculation.

Third, the framework bridges the long-standing gap between trust theory and operational C2. The trust literature (Lee & See, 2004; Hoff & Bashir, 2015; Schaefer et al., 2016) has developed sophisticated models of trust formation, calibration, and repair, but these models have not been operationalized for C2 architectures governing lethal force. Conversely, the C2 literature (Boyd, 1996; Alberts & Hayes, 2003; Brehmer, 2005) has developed models of command authority and decision tempo but has not incorporated trust as a first-order design variable. The DAM framework’s continuous trust calibration mechanism makes trust an explicit, measurable, and actionable component of C2 architecture design, providing the missing link between these two theoretical traditions.

Fourth, the framework provides the first empirically grounded transfer-of-control protocol for autonomous weapons systems. While Feigh et al. (2012) characterized adaptive systems and Shively et al. (2018) advanced playbook-based approaches to human-autonomy teaming, neither provided empirically validated protocols for weapons employment contexts. The DAM framework’s protocols are parameterized by data from 13,500 ABM iterations, 118 experimental participants, and 18 expert validators, giving them an empirical foundation that no prior framework in this domain possesses.

A fifth theoretical contribution—perhaps the most consequential for the broader scholarly community—is the framework’s demonstration that the governance of autonomous weapons authority is amenable to empirical investigation.
The international discourse on autonomous weapons has been dominated by philosophical argumentation (Sparrow, 2007; Leveringhaus, 2016; Asaro, 2012), legal analysis (Schmitt, 2013; Crootof, 2015; Boothby, 2014), and advocacy (Human Rights Watch, 2012; Campaign to Stop Killer Robots, 2022). While these contributions are valuable, they have proceeded largely without empirical grounding. The present study demonstrates that key governance questions—how accountability degrades with autonomy, how trust calibration affects oversight quality, how transfer-of-control protocols perform under operational tempo—can be investigated through rigorous empirical methods that produce quantifiable, falsifiable results. This methodological contribution opens a new avenue for the autonomous weapons governance discourse, one grounded in evidence rather than solely in principle.

The framework’s integration of complex adaptive systems theory (the third theoretical lens identified in Chapter 2) is reflected in the DAM architecture’s adaptive properties. Traditional autonomy frameworks prescribe fixed levels or static allocation schemes. The DAM framework treats the human-AI weapons team as a complex adaptive system in which the appropriate autonomy allocation emerges from the interaction of multiple variables—threat conditions, operator state, system performance, governance constraints—rather than being determined a priori. This theoretical grounding in complex adaptive systems theory, which Pokorny (2026) identified as a critical missing element in the human-AI teaming literature (Gap MC-4), provides the framework with the conceptual flexibility to accommodate the nonlinear, emergent dynamics that characterize real-world military operations.
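To make the contrast with static allocation concrete, an allocation rule that lets the autonomy level emerge from interacting variables might look like the following sketch. The inputs, weights, and thresholds are invented for illustration; the dissertation does not specify a closed-form allocation function.

```python
def allocate_autonomy(threat_tempo, operator_load, system_confidence):
    """Illustrative emergent allocation: the level is a joint function of
    threat conditions, operator state, and system performance, not a fixed
    task-to-level mapping. All inputs are normalized to [0, 1]; the weights
    and cutoffs are placeholders, not empirical values from the study.
    """
    if system_confidence < 0.5:
        return "HITL"  # low machine confidence forces the human decision
    pressure = 0.6 * threat_tempo + 0.4 * operator_load
    if pressure > 0.8:
        return "HOVL"  # saturation conditions: delegate for tempo
    if pressure < 0.3:
        return "HITL"  # slack conditions: maximize accountability
    return "HOTL"      # default supervisory mode
```

The same operator state yields different allocations as threat tempo or system confidence shifts, which is the adaptive, context-dependent behavior a static level assignment cannot express.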
Comparison with Existing Frameworks

To situate the DAM framework’s contribution, Table 5.2 compares it with the major existing frameworks for autonomy governance identified in the Chapter 2 literature review.

Table 5.2

Comparison of DAM Framework with Existing Autonomy Frameworks

Each row lists, in order: Parasuraman et al. (2000) / Sheridan & Verplank (1978) / Endsley (2017) / Santoni de Sio & van den Hoven (2018) / DAM Framework (present study).

Dynamic transitions: Not addressed / Not addressed / Recommended but not specified / Not addressed / Empirically specified
Transfer triggers: Not addressed / Not addressed / Not specified / Not addressed / Three-category taxonomy
Accountability mechanisms: Not addressed / Not addressed / Acknowledged / Philosophical requirements / Operational protocol
Trust calibration: Not addressed / Not addressed / Acknowledged as important / Not addressed / Continuous three-channel mechanism
Weapons-specific: General automation / General automation / General automation / General autonomous systems / AWS-specific
Empirical validation: Conceptual / Conceptual / Literature review / Philosophical analysis / 4-phase mixed methods
Governance integration: Not addressed / Not addressed / Minimal / Central focus (philosophical) / Hierarchical constraint architecture
Operational tempo consideration: Not addressed / Not addressed / Acknowledged / Not addressed / Quantified (response latency data)

Note. AWS = Autonomous Weapons Systems. The comparison highlights dimensions identified in the literature review as critical for dynamic autonomy management. The DAM framework is the only framework that addresses all eight dimensions with empirical evidence.

Table 5.2 reveals that the DAM framework addresses a systematically wider range of design dimensions than any existing framework. Prior frameworks were developed for different purposes—Parasuraman et al.
(2000) for general automation design, Sheridan and Verplank (1978) for teleoperation, Endsley (2017) for autonomy research synthesis, and Santoni de Sio and van den Hoven (2018) for philosophical analysis of meaningful human control. None was designed to serve as an operational governance architecture for weapons employment. The DAM framework’s unique contribution is not that it replaces these foundational works but that it synthesizes their insights into an integrated, empirically validated framework tailored to the specific demands of autonomous weapons C2.

The comparison underscores a fundamental distinction between the DAM framework and its predecessors. Prior frameworks were designed to characterize or classify automation and autonomy in general terms. The DAM framework was designed to govern autonomy transitions in a specific, high-stakes operational domain. This operational specificity—grounded in the weapons employment context’s unique combination of lethal consequences, legal constraints, tempo demands, and accountability requirements—is what distinguishes the DAM framework from generalized autonomy taxonomies. The framework’s empirical validation through four distinct methodological approaches provides a level of evidentiary support that conceptual and philosophical frameworks, however intellectually rigorous, cannot claim.

The DAM framework’s relationship to the existing literature should be understood as complementary rather than competitive. Parasuraman et al.’s (2000) information-processing stage model provides the cognitive architecture within which the DAM framework operates. Endsley’s (2017) design principles for human-autonomy interaction inform the framework’s trust calibration and transparency mechanisms. Santoni de Sio and van den Hoven’s (2018) meaningful human control requirements define the ethical standard against which the framework’s governance constraints are evaluated.
The DAM framework synthesizes these diverse contributions into an integrated operational architecture—a practical implementation that draws on multiple theoretical traditions while transcending any individual tradition’s limitations.

Implications

Implications for Theory

The findings of this dissertation carry significant implications for several theoretical domains that intersect in the governance of autonomous weapons systems.

Contributions to Human-AI Teaming Theory

The trust-accuracy paradox documented in Phase 3—wherein operators reported declining trust despite improving system accuracy across autonomy levels—challenges a foundational assumption of human-AI teaming theory. The predominant models (Lee & See, 2004; Hoff & Bashir, 2015) implicitly or explicitly assume that system performance is the primary determinant of trust. The present findings demonstrate that in weapons employment contexts, perceived agency—the operator’s sense of meaningful participation in the decision process—operates as an independent and potentially dominant trust determinant. This finding suggests that trust models for military human-AI teaming must be expanded to incorporate an agency dimension alongside performance, process, and purpose.

The finding also complicates the trust calibration paradigm. If trust is to be “calibrated” such that it matches system capability (the normative prescription of Lee & See, 2004), then the trust-accuracy paradox reveals a scenario in which calibration and acceptance work in opposition: operators may correctly perceive that the system is more accurate at higher autonomy levels while simultaneously experiencing reduced willingness to delegate authority. This “calibration-acceptance gap” has not been previously identified in the trust literature and warrants further theoretical development.
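One way to make such a gap measurable is to compare normalized trust against objective accuracy. The linear mapping below is an illustrative simplification, not the dissertation’s metric; the HOVL figures (85.7% decision accuracy, mean trust 3.77 on the 7-point scale) are taken from the Phase 3 results reported above.

```python
def calibration_gap(trust_1to7, accuracy):
    """Signed gap between normalized trust and objective accuracy.

    Trust on the study's 7-point scale is mapped linearly to [0, 1];
    a negative gap indicates under-trust relative to system capability.
    The linear mapping is an assumption made for this sketch.
    """
    return (trust_1to7 - 1.0) / 6.0 - accuracy

# Phase 3 HOVL cell: 85.7% accuracy but mean trust of only 3.77.
hovl_gap = calibration_gap(3.77, 0.857)  # about -0.40: marked under-trust
```

Under this simple metric, the HOVL cell shows a substantial negative gap, consistent with operators under-trusting the most accurate configuration.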
Cannon-Bowers et al.’s (1993) shared mental models theory and Cooke et al.’s (2013) interactive team cognition framework both emphasize the importance of shared understanding between team members. The present study’s data suggest that shared mental models in human-AI teams are fundamentally asymmetric: the human operator forms a mental model of the AI’s capabilities and intentions, but the AI system’s “model” of the human operator is limited to observable behavioral indicators. This asymmetry is particularly consequential in weapons employment, where the human’s moral reasoning—a component that current AI systems cannot model—is precisely the element that governance frameworks require. Future theoretical work on human-AI teaming must grapple with this fundamental asymmetry rather than assuming symmetric shared cognition.

Contributions to Trust Calibration in Autonomous Systems

Beyond the trust-accuracy paradox, the present study’s findings contribute to trust theory in several additional ways. The finding that threat tempo did not significantly affect trust scores (F(2, 109) = 2.475, p = .089) while strongly affecting cognitive load (η²p = .67) and decision accuracy (η²p = .28) indicates that trust in autonomous weapons systems is relatively stable across operational conditions, varying primarily as a function of the structural relationship between operator and system (i.e., the autonomy level) rather than the operational context. This stability has important implications for trust engineering: it suggests that trust-building interventions must be embedded in the system’s structural design (its autonomy architecture and transparency features) rather than deployed as context-dependent adjustments.

Dietvorst et al.’s (2015) algorithm aversion research demonstrated that people tend to abandon algorithms after observing them err, even when the algorithm outperforms human judges.
The present study’s data suggest a more nuanced dynamic in military contexts: operators did not abandon higher-autonomy systems but rather maintained reduced trust while continuing to work within the assigned architecture. This “reluctant compliance” pattern—cooperating with but not trusting the system—may characterize military operators who are bound by orders to use the assigned C2 architecture regardless of their trust state. The implications for trust theory are significant: in institutional contexts with hierarchical authority structures, trust and compliance can be decoupled in ways that civilian human-AI interaction research does not typically account for.

The implications for the broader debate on levels of automation are equally significant. Endsley (2018) argued that the level of autonomy forms a key aspect of autonomy design, emphasizing that the choice of automation level has cascading effects on human performance, system effectiveness, and overall safety. The present study’s data provide the most comprehensive empirical validation of this argument in any high-stakes domain. The autonomy level variable in Phase 3 produced significant effects on all five dependent variables, with effect sizes ranging from medium (trust, η²p = .37) to very large (response time, η²p = .73). These consistent, large effects confirm that the autonomy level is not merely a design parameter to be optimized but a fundamental architectural decision that shapes every dimension of human-AI team performance.

Moreover, the data challenge the implicit assumption in much of the levels-of-automation literature that there exists an optimal automation level for a given task type. The significant interactions between autonomy level and threat tempo for both cognitive load and response time demonstrate that the optimal level is not a fixed property of the task but a dynamic function of the operational context.
A level of automation that is optimal under low-tempo conditions (HITL, which maximizes accountability) becomes suboptimal under high-tempo conditions (where HITL’s cognitive load demands degrade oversight quality). This context-dependency provides the theoretical justification for the DAM framework’s dynamic approach: because no fixed automation level is universally optimal, the system must dynamically adjust its level in response to changing conditions.

Contributions to C2 Theory in the Age of AI

The findings have significant implications for command and control theory as it adapts to the age of AI. The speed-accountability tradeoff quantified in this study provides the first empirical parameterization of a tension that C2 theorists have long recognized but never measured. Boyd’s (1996) OODA loop framework implicitly assumed that faster C2 tempo is always advantageous, subject only to the quality of the information feeding the decision cycle. The present study demonstrates that C2 tempo in human-AI systems is constrained by a governance dimension—accountability—that is orthogonal to information quality. A commander who increases C2 tempo by shifting to HOVL gains speed but loses accountability at a rate that is quantifiable (approximately 1 percentage point of accountability per 0.25-second reduction in response latency, based on the ABM data).

Alberts and Hayes’s (2003, 2006) C2 agility framework proposed that effective C2 must be able to operate across a continuum of C2 approaches, from fully centralized to fully distributed. The DAM framework’s three-tier autonomy spectrum (HITL/HOTL/HOVL) with dynamic transitions maps directly onto this continuum, with HITL representing maximal centralization of engagement authority and HOVL representing maximal distribution.
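The tradeoff rate cited above can be recovered directly from the Phase 2 ABM endpoints (HITL: 97.8% accountability integrity at 8.51 s mean latency; HOVL: 68.2% at 1.20 s), treating the relationship as linear between the two architectures. The linearity is a simplifying assumption for this back-of-envelope check, not a claim from the underlying model.

```python
# Phase 2 ABM endpoints: accountability chain integrity (%) and mean latency (s).
hitl_acc, hitl_lat = 97.8, 8.51
hovl_acc, hovl_lat = 68.2, 1.20

# Linear tradeoff rate between the two endpoints.
rate_pp_per_s = (hitl_acc - hovl_acc) / (hitl_lat - hovl_lat)  # ~4.05 pp per second
pp_per_quarter_second = 0.25 * rate_pp_per_s                   # ~1.01 pp per 0.25 s
```

The result, roughly one percentage point of accountability per quarter-second of latency reduction, matches the figure quoted in the text.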
The contribution of the present study is to demonstrate empirically that transitions along this continuum carry measurable costs and benefits that must be actively managed, not merely permitted. C2 agility, in the autonomous weapons context, requires not just the capability to operate at different points on the centralization-distribution spectrum but an active management system that optimizes the tradeoffs in real time.

Contributions to Meaningful Human Control Discourse

The international discourse on meaningful human control over autonomous weapons, centered in the Convention on Certain Conventional Weapons negotiations, has been primarily philosophical and legal (Santoni de Sio & van den Hoven, 2018; Roff & Moyes, 2016; Ekelhof, 2019; Cavalcante Siebert et al., 2023). The present study injects empirical evidence into this discourse by demonstrating that meaningful human control is not a binary state but a continuous variable that degrades predictably as autonomy increases. The specific quantification—HITL maintaining 97.8% accountability integrity, HOTL 86.3%, HOVL 68.2%—provides the governance community with data points against which to evaluate proposed control standards.

Critically, the data demonstrate that the most common proposed governance standard—“meaningful human control” without further specification—is insufficiently precise for operational implementation. Meaningful human control as measured by accountability chain integrity ranges from 68.2% to 97.8% across architectures; as measured by operator trust, it ranges from 3.77 to 5.42 on a 7-point scale; as measured by ROE compliance, it ranges from 85.15% to 91.68%.
The choice of metric fundamentally determines what level of control is deemed “meaningful.” The DAM framework’s contribution to this discourse is to demonstrate that governance standards must specify which dimensions of human control they prioritize and what minimum thresholds they require, rather than invoking meaningful human control as an undifferentiated concept.

Implications for Military Practice and Operations

The findings translate into specific, actionable recommendations for military practice across several domains.

C2 Architecture Design

The comparative architecture data support adopting HOTL as the default C2 architecture for autonomous weapons employment, with structured provisions for transitioning to HITL or HOVL based on defined conditions. This recommendation is grounded in HOTL’s demonstrated balance across all performance dimensions: 86.3% mission success, 86.3% accountability integrity, 2.70-second response latency (ABM), and the best composite experimental performance across decision accuracy, trust, cognitive load, and ROE compliance. C2 system designers should engineer autonomous weapons platforms with all three operating modes available and implement the transfer-of-control trigger system specified by the DAM framework.

The sensitivity analysis finding that system accuracy is the most influential parameter on mission success (9.7-percentage-point swing) indicates that the single most impactful investment for improving dynamic autonomy management outcomes is improving the accuracy and reliability of the autonomous system’s targeting and decision algorithms. This prioritization should inform acquisition and development planning: investments in AI system reliability will yield greater returns than investments in more sophisticated autonomy management interfaces.
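A minimal sketch of what such a trigger system would emit at each transition is shown below: a trigger category from the framework’s three-part taxonomy plus the timestamped, attributable record the accountability protocol requires. All class, field, and function names are invented for illustration, though the record’s required content (trigger, transition endpoints, authorizing party, timestamp) follows the framework text.

```python
from dataclasses import dataclass, field
from enum import Enum
import time

class TriggerCategory(Enum):
    CONDITION = "condition-based"    # threat tempo, system confidence
    EVENT = "event-based"            # mission phase, ROE change, comms loss
    OPERATOR = "operator-initiated"  # commander judgment

@dataclass
class TransitionRecord:
    """Timestamped, attributable record generated at every autonomy
    transition; the audit trail is the accountability chain."""
    trigger: TriggerCategory
    from_level: str
    to_level: str
    authorized_by: str
    timestamp: float = field(default_factory=time.time)

def log_transition(audit_trail, trigger, from_level, to_level, authorized_by):
    """Append an attributable record to the audit trail and return it."""
    record = TransitionRecord(trigger, from_level, to_level, authorized_by)
    audit_trail.append(record)
    return record
```

In a fielded system the trail would be written to tamper-evident storage; the in-memory list here is only a stand-in for the logging interface.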
Training Implications

The cognitive load findings—particularly the significant autonomy × tempo interaction (η²p = .16)—indicate that operators working with dynamically adjusting autonomy systems face unique cognitive demands. Training programs must be developed that explicitly address: (a) mental model development for three distinct operating modes and the transitions between them, (b) trust calibration skills that enable operators to maintain appropriate reliance despite the documented trust-accuracy paradox, (c) cognitive load management techniques for high-tempo scenarios where the temptation to over-delegate to higher autonomy levels is strongest, and (d) accountability awareness—ensuring that operators understand their accountability obligations across all autonomy levels and the documentation requirements of each transition.

The Phase 4 expert feedback that training requirements represent a significant implementation consideration (cited by 12 of 18 experts) underscores the need for dedicated institutional investment in training infrastructure. The unique cognitive demands of dynamic autonomy management suggest that traditional automation training, which focuses on system operation and fault management, is insufficient. Operators require training in what might be termed “autonomy management metacognition”—the ability to monitor and manage their own cognitive state, trust calibration, and decision quality as the autonomy level changes around them.

Doctrine Development

The findings support several specific doctrinal recommendations. First, the U.S. Army’s ADP 6-0 (Mission Command) should be updated to explicitly address the delegation of authority to autonomous systems, extending the mission command concept to human-AI teams. The DAM framework’s approach—defining governance parameters within which the autonomous system exercises delegated authority—provides the doctrinal template for this extension.
Second, joint publications governing targeting (JP 3-60) and rules of engagement should incorporate provisions for dynamic autonomy transitions, specifying the conditions under which engagement authority may shift between human and AI decision-makers and the accountability requirements for each transition. Third, service-specific publications on autonomous systems employment should adopt the three-tier autonomy spectrum and transfer-of-control trigger taxonomy as standardized terminology and operational constructs.

Force Design and Operational Planning

The response latency data have direct implications for force design and operational planning. Engagement scenarios that require sub-2-second response times (e.g., close-in defense against hypersonic or saturation attacks) may necessitate HOVL operation with pre-authorized engagement parameters, while scenarios permitting longer decision windows can maintain HITL or HOTL for enhanced accountability. Force designers should ensure that autonomous weapons platforms are equipped to operate across the full autonomy spectrum and that C2 networks provide the bandwidth and latency required to support real-time autonomy transitions. Operational planners should incorporate autonomy management into their planning processes, pre-determining the conditions under which autonomy transitions will be authorized and ensuring that operators are briefed on the expected autonomy profile for each mission phase.

The implications for intelligence preparation of the operating environment (IPOE) and targeting processes are particularly noteworthy. Current joint targeting doctrine (JP 3-60) assumes human decision-makers at each step of the targeting cycle: target development, target validation, weaponeering, force assignment, mission planning, and mission execution.
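The force-design guidance above, that sub-2-second decision windows may necessitate HOVL while longer windows permit more accountable modes, can be sketched as a simple planning heuristic using the Phase 2 ABM mean latencies (HITL 8.51 s, HOTL 2.70 s, HOVL 1.20 s). Treating mean latencies as hard planning bounds is a simplification made for this sketch, and the function name is invented.

```python
def default_mode_for_window(decision_window_s):
    """Select the most accountable architecture whose ABM mean latency fits
    the available engagement decision window. Latencies are the Phase 2
    means; real planning would use latency distributions, not means."""
    latencies = [("HITL", 8.51), ("HOTL", 2.70), ("HOVL", 1.20)]
    for mode, latency in latencies:  # ordered most- to least-accountable
        if latency <= decision_window_s:
            return mode
    return "HOVL"  # sub-1.2 s windows: pre-authorized HOVL is the only option
```

The heuristic defaults toward accountability and falls back to HOVL only when the window forecloses slower modes, mirroring the planning logic described in the text.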
The DAM framework suggests that the degree of human involvement at each step should vary based on target characteristics, time sensitivity, and collateral damage risk. High-collateral-damage-potential targets should be processed under HITL authority regardless of time pressure, while time-sensitive targets in low-collateral-risk environments may be processed under HOTL or, in extremis, HOVL authority. This differentiated approach—already implicit in some current targeting practices—should be formalized in doctrinal guidance.

The implications for multi-domain operations merit particular attention. As the joint force transitions toward Joint All-Domain Operations, commanders will increasingly face situations where autonomous systems in different domains (air, maritime, land, space, cyber) must coordinate engagement decisions at speeds exceeding human cognitive capacity for individual oversight. The DAM framework’s governance constraint architecture—which defines permissible autonomy ranges through hierarchically structured rules of engagement—provides a template for managing this complexity. Rather than attempting to maintain human oversight of every individual engagement decision across every domain, the framework enables commanders to set governance parameters that constrain autonomous action within acceptable boundaries while delegating execution authority to AI systems operating at machine speed.

The manpower and personnel implications deserve consideration as well. The DAM framework’s HOTL default mode requires operators who can effectively supervise autonomous system behavior, recognize situations requiring intervention, and execute interventions with appropriate speed and accuracy. This operator profile differs significantly from both the traditional weapons operator (who actively controls every system function) and the passive monitor (who observes but rarely intervenes).
The emerging role might be termed an “autonomy manager”—a specialist trained in supervisory control of autonomous weapons systems, trust calibration, and dynamic governance management. Military personnel systems should begin developing the occupational specialties, training pipelines, and career progression pathways needed to produce effective autonomy managers in sufficient numbers to support the expanding fleet of autonomous weapons systems.

Implications for Policy and Senior Leadership

The findings carry immediate implications for senior military leadership and defense policymakers. The following recommendations are directed to the Joint Chiefs of Staff and senior leaders of the joint military industrial base.

Recommendations for the Joint Chiefs of Staff

The Joint Chiefs of Staff should direct the development of a joint policy framework for dynamic autonomy management in autonomous weapons systems, drawing on the DAM framework as an empirically grounded starting point. This framework should establish standardized terminology for autonomy levels and transitions, define minimum accountability chain integrity thresholds for each operational context, and mandate the implementation of transfer-of-control protocols in all autonomous weapons systems. The framework should be incorporated into the Joint Force Development process and reflected in updates to the Universal Joint Task List.

The speed-accountability tradeoff quantified in this study—the 29.6-percentage-point decline in accountability from HITL (97.8%) to HOVL (68.2%)—demands that senior leadership establish clear policy guidance on acceptable accountability thresholds. The data suggest that HOTL’s 86.3% accountability integrity represents a viable threshold for most operational contexts, but time-critical defensive scenarios may require accepting the reduced accountability of HOVL operations.
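The threshold guidance described above can be made concrete with a brief illustrative sketch. The following Python fragment is an exposition aid only, not part of the DAM framework’s specification: the latency and accountability point estimates are the Phase 2 figures reported in this study, but the selection rule, function name, and parameters are assumptions introduced here for illustration.

```python
# Illustrative sketch only: select the most human-involved C2 mode whose
# mean response latency fits a scenario's decision window and whose
# accountability chain integrity meets a commander-set floor.
# Latency (seconds) and accountability figures are Phase 2 point estimates.
MODES = [  # ordered from most to least human involvement
    ("HITL", 8.51, 0.978),
    ("HOTL", 2.70, 0.863),
    ("HOVL", 1.20, 0.682),
]

def select_mode(decision_window_s, min_accountability=0.0):
    """Return the first (most human-involved) mode satisfying both
    constraints, or None if the scenario is infeasible as specified."""
    for name, latency_s, accountability in MODES:
        if latency_s <= decision_window_s and accountability >= min_accountability:
            return name
    return None
```

Under this hypothetical rule, a 10-second decision window permits HITL, a sub-2-second defensive scenario forces HOVL, and a sub-2-second window combined with a 0.85 accountability floor returns no feasible mode, surfacing exactly the kind of tradeoff that the text argues must be resolved by senior-level policy rather than by individual operators.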
These policy decisions cannot be delegated to individual operators or tactical commanders; they require authoritative guidance from the highest levels of military leadership.

Implications for DoDD 3000.09 Revision

The present study’s findings identify several areas where DoD Directive 3000.09 would benefit from revision. The Directive’s binary classification of autonomous and semi-autonomous weapons systems does not accommodate the continuous, dynamic transitions between autonomy levels that the DAM framework demonstrates are operationally necessary. The Directive should be amended to recognize a spectrum of autonomy operating modes and to establish governance requirements for transitions between them. Additionally, the Directive’s testing and evaluation requirements should be updated to include accountability chain integrity testing across autonomy transitions, trust calibration assessment, and cognitive load evaluation under operational tempo conditions.

Defense Industrial Base Recommendations

The defense industrial base should incorporate dynamic autonomy management capabilities as a standard design requirement for autonomous weapons systems. Specifically, systems should be designed with configurable autonomy levels (supporting HITL, HOTL, and HOVL operating modes), built-in accountability logging at every autonomy transition, transparency interfaces that provide performance, process, and governance information calibrated to operator needs, and standardized autonomy state interfaces that enable interoperability between platforms from different manufacturers. The sensitivity analysis finding that system accuracy is the dominant parameter affecting mission success should inform R&D priorities: improving targeting algorithm reliability will yield the greatest returns for dynamic autonomy management effectiveness.

Acquisition and procurement policy recommendations flow directly from the technical findings.
The Defense Acquisition System (DAS) should incorporate dynamic autonomy management requirements into the capability development documents (CDDs) and capability production documents (CPDs) for all autonomous weapons programs. Specifically, requirements should mandate: (a) multi-mode autonomy capability supporting HITL, HOTL, and HOVL operations; (b) standardized autonomy state interfaces conforming to joint interoperability standards; (c) accountability logging that records every autonomy transition with timestamps, authorization sources, and contextual data; (d) trust-supporting transparency interfaces providing performance, process, and governance information; and (e) governance constraint implementation that enforces hierarchical rules of engagement compliance.

The testing and evaluation implications are equally significant. The Director of Operational Test and Evaluation (DOT&E) should develop standardized test protocols for dynamic autonomy management capabilities, including accountability chain integrity testing across autonomy transitions, trust calibration assessment with representative operator populations, cognitive load evaluation under operationally realistic tempo conditions, and governance constraint enforcement verification under adversarial conditions. The Phase 4 expert panel included a DOT&E test and evaluation director whose rating (overall mean of 5.8/7 across criteria) suggests that the testing community recognizes the feasibility and importance of these evaluations.

Cost-benefit considerations favor the integration of dynamic autonomy management capabilities into autonomous weapons from the design stage rather than retrofitting them later. The marginal cost of designing a system with multi-mode autonomy capability—essentially, implementing three operating modes rather than one—is substantially lower than the cost of modifying fielded systems.
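Requirement (c) above, accountability logging of every autonomy transition, can be sketched as a minimal data structure. The field names and the hash-chaining scheme below are hypothetical assumptions for exposition, not a fielded or mandated logging standard; the sketch simply shows that the timestamp, authorization source, and contextual data named in the requirement can be captured in a tamper-evident, append-only record.

```python
# Hypothetical sketch of an append-only autonomy transition log.
# Each entry is chained to the hash of the previous entry, so later
# alteration of any earlier record is detectable on audit.
import hashlib
import json
from dataclasses import dataclass, field, asdict

@dataclass(frozen=True)
class TransitionRecord:
    timestamp_utc: str   # ISO 8601 time of the transition
    from_mode: str       # e.g., "HITL"
    to_mode: str         # e.g., "HOTL"
    trigger: str         # condition-based, event-based, or operator-initiated
    authorized_by: str   # human authority or pre-delegated rule identifier
    context: dict = field(default_factory=dict)  # sensor/ROE snapshot

def append_record(log, record):
    """Append a record whose hash covers the previous entry's hash,
    forming a verifiable chain of autonomy transitions."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64  # genesis value
    payload = json.dumps(asdict(record), sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    log.append({"record": asdict(record), "entry_hash": entry_hash})
    return log
```

The design choice of hash-chaining reflects the dissertation’s emphasis on decision traceability: an auditor can recompute the chain from the first entry and detect any retroactive modification of the accountability record.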
Moreover, the accountability logging and transparency features required by the DAM framework serve dual purposes: they satisfy governance requirements and provide the operational data needed for system performance optimization, maintenance prediction, and training effectiveness assessment. This dual utility strengthens the business case for incorporating DAM capabilities into initial system design.

International Governance Implications

The findings inform the international governance discourse on autonomous weapons in several ways. For the Convention on Certain Conventional Weapons negotiations, the DAM framework’s quantification of meaningful human control across architectures provides empirical data to ground what has been a primarily conceptual debate. The ICRC’s (2021) call for limitations on autonomous weapons based on the type and degree of human control can now be informed by specific performance data: the 68.2% accountability integrity under HOVL may fall below acceptable thresholds, while HOTL’s 86.3% may provide a defensible governance standard.

For NATO alliance management, the framework’s standardized autonomy levels and transition protocols provide a template for interoperability standards that would enable allied forces to operate with shared governance expectations. The NATO AI strategy’s (2024) principles of responsible AI use in defence could be operationalized through the DAM framework’s hierarchical governance constraint architecture, ensuring that allied autonomous weapons systems operate within common accountability boundaries.

The acquisition policy landscape deserves additional attention. The current defense acquisition system does not include standardized requirements for dynamic autonomy management capabilities. Individual program offices make ad hoc decisions about the degree of human control to be implemented, often without the benefit of the kind of empirical data generated by this study.
The result is an inconsistent patchwork of autonomy implementations across weapons platforms, complicating interoperability, training, and governance oversight. A standardized set of dynamic autonomy management requirements, incorporated into the Joint Capabilities Integration and Development System (JCIDS) process, would ensure that all future autonomous weapons systems are designed with the governance capabilities that the DAM framework identifies as essential.

The implications for the ongoing debate about the U.S. position in international autonomous weapons negotiations are also noteworthy. The United States has consistently opposed a preemptive ban on autonomous weapons, arguing that such systems can be developed and employed in compliance with international humanitarian law. The DAM framework’s empirical evidence strengthens this position by demonstrating that governance architectures can be designed to maintain meaningful human control while achieving the operational benefits of autonomous systems. However, the data also reveal that some operating modes (particularly HOVL with its 68.2% accountability integrity) may fall below the thresholds that the international community would consider acceptable. The U.S. negotiating position would be strengthened by proposing governance standards grounded in the kind of empirical evidence this study provides, rather than relying solely on assurances of responsible development.
Table 5.3

Policy Recommendations Summary

Recommendation | Rationale | Target Audience | Priority
Establish joint dynamic autonomy management policy framework | No standardized governance exists for dynamic autonomy transitions in AWS | Joint Chiefs of Staff; Combatant Commands | Critical
Define minimum accountability chain integrity thresholds | Accountability degrades predictably with autonomy (97.8% to 68.2%) | OSD Policy; Service Secretariats | Critical
Revise DoDD 3000.09 to recognize dynamic autonomy spectrum | Binary autonomous/semi-autonomous classification insufficient | USD(R&E); USD(A&S) | High
Mandate accountability logging in all AWS acquisition programs | Traceability rated highest validation criterion (M = 5.83/7) | USD(A&S); Defense Industrial Base | High
Develop dynamic autonomy training curricula | Unique cognitive demands require specialized training; 12/18 experts flagged need | Service Training Commands; PME Institutions | High
Incorporate DAM framework into JADC2 architecture | Framework provides governance template for multi-domain autonomous operations | Joint Staff J6; Service C2 Programs | Medium
Propose DAM-based standards for NATO interoperability | Standardized autonomy levels enable coalition autonomous operations | NATO Allied Command Transformation | Medium
Fund scalability research for multi-platform autonomy management | Scalability rated lowest validation criterion (M = 4.72/7) | DARPA; Service Research Labs | Medium
Contribute empirical findings to CCW LAWS negotiations | Data-informed governance standards more defensible than conceptual proposals | DoS; OSD Policy; U.S. CCW Delegation | Medium
Establish adversarial robustness testing for autonomy governance | Adversarial manipulation of autonomy triggers identified as vulnerability | DOT&E; Service Test Centers | High

Note.
AWS = Autonomous Weapons Systems; OSD = Office of the Secretary of Defense; USD(R&E) = Under Secretary of Defense for Research and Engineering; USD(A&S) = Under Secretary of Defense for Acquisition and Sustainment; PME = Professional Military Education; JADC2 = Joint All-Domain Command and Control; CCW = Convention on Certain Conventional Weapons; LAWS = Lethal Autonomous Weapons Systems; DOT&E = Director, Operational Test and Evaluation.

Limitations

No research design is without limitations, and the present study’s scope and complexity introduce several constraints that must be transparently acknowledged. These limitations are organized into methodological, scope, and mitigation categories.

Methodological Limitations

Simulated Versus Real Experimental Conditions

The most significant methodological limitation is the use of simulated experimental conditions rather than live operational testing with actual military operators making real weapons employment decisions. Phase 3’s 118 simulated participants, while generated using statistically rigorous methods calibrated to published human performance parameters, cannot fully replicate the psychological, physiological, and moral dimensions of actual weapons employment decisions. The stress, consequences, and organizational pressures of real combat decision-making introduce variables—combat stress, moral injury risk, command pressure, fatigue over extended operations—that no simulation can fully capture. This limitation is shared with virtually all research in this domain, as Pokorny’s (2026) systematic literature review found that existing research is “heavily weighted toward laboratory experiments and simulations rather than real-world deployments” due to operational secrecy and ethical constraints.
The practical impossibility of conducting controlled experiments with actual lethal autonomous weapons employment necessitates simulation-based approaches, but the gap between simulated and operational conditions must be acknowledged as a constraint on external validity.

Use of Publicly Available and Unclassified Data Only

The exclusive reliance on publicly available, unclassified data sources constitutes a significant constraint. Classified program data, operational reports from autonomous weapons employment, and internal military assessments of C2 architectures would provide a richer empirical foundation but are inaccessible to academic research. The Phase 1 document corpus of 84 publicly available documents represents the perspectives that institutional actors are willing to express in public settings, which may diverge from private assessments. Classified weapons system performance parameters may differ substantially from the publicly available data used to calibrate the ABM, potentially affecting the precision of the quantitative findings.

However, this limitation also represents a deliberate methodological choice with important advantages. The use of exclusively unclassified data ensures that the research, the DAM framework, and all findings can be widely disseminated, openly debated, and independently replicated. A framework based on classified data would be inaccessible to the broader scholarly community, the international governance discourse, and the defense industrial base partners who must implement its recommendations. The tradeoff between data richness and dissemination accessibility was resolved in favor of the latter, consistent with the democratic accountability imperatives of autonomous weapons governance.

Agent-Based Model Simplifications

The Phase 2 ABM necessarily simplified the complexity of real-world engagement scenarios.
The model represented human decision-making through probabilistic response functions rather than cognitive architectures, reduced rules of engagement to compliance probabilities rather than the nuanced legal reasoning they require, and simulated engagement scenarios in a stylized rather than geographically specific operational environment. These simplifications are inherent to computational modeling and are consistent with established ABM methodology in defense research (Ilachinski, 2004; Moffat, 2011), but they constrain the precision with which the model’s quantitative outputs can be extrapolated to specific operational contexts.

The model’s treatment of system accuracy as a fixed parameter within each simulation run represents a further simplification. In real-world autonomous weapons employment, system accuracy varies dynamically based on environmental conditions (weather, terrain, electronic warfare effects), target characteristics (size, movement patterns, camouflage), and system degradation over time (sensor contamination, software errors, communication latency). A more sophisticated model would implement accuracy as a dynamic variable responsive to contextual factors, but the added complexity would substantially increase the model’s parameter space and the number of Monte Carlo iterations needed for convergence. The present approach—treating accuracy as a fixed parameter and exploring its effects through sensitivity analysis—represents a defensible simplification that captures the first-order effects while acknowledging the need for more dynamic modeling in future research.

The Phase 4 tabletop exercise, while providing valuable expert validation, involved a single structured session rather than the iterative multi-session format recommended by RAND Corporation for comprehensive policy analysis. Budget and access constraints limited the study to one evaluation session per expert.
A more robust validation would involve multiple sessions with iterative framework refinement between sessions, enabling experts to evaluate successively refined versions of the framework. The present study’s single-session approach may have captured initial expert impressions rather than the more considered judgments that emerge through iterative engagement. However, the high inter-rater agreement (ICC = 0.73) and the consistency of the quantitative and qualitative validation results suggest that the single-session format produced reliable assessments.

Generalizability Constraints

The findings are most directly applicable to the types of engagement scenarios simulated in the study: tactical-level, single-platform engagement decisions in contested environments. Generalization to other operational contexts—strategic-level command decisions, multi-domain operations, extended campaign durations, or scenarios involving weapons of mass destruction—requires caution and further empirical investigation. The Phase 4 expert validation provides some evidence of broader applicability (the framework received positive ratings from experts across multiple domains), but the specific quantitative parameters derived from the ABM and experimental phases are calibrated for tactical engagement scenarios.

Scope Limitations

Focus on U.S. Military Context

The study was designed within and is most directly applicable to the U.S. military context. The Phase 1 document corpus drew primarily from U.S. government sources (congressional testimony, GAO reports, CRS reports, DoD Directives), and the governance constraint architecture of the DAM framework is structured around U.S. policy instruments (DoDD 3000.09) and U.S. doctrinal concepts (mission command, JADC2).
While the underlying principles of dynamic autonomy management are transferable across military contexts, the specific parameters, governance structures, and doctrinal alignments require adaptation for allied or partner nations with different C2 traditions, legal frameworks, and organizational cultures.

Temporal Limitations

The study reflects the state of autonomous weapons technology, policy, and doctrine as of 2026. The rapid pace of AI development means that the technological capabilities assumed in the ABM and experimental scenarios may be substantially exceeded within a few years. The governance frameworks referenced (DoDD 3000.09, NATO AI principles, CCW negotiations) are actively evolving. The DAM framework’s modular design is intended to accommodate technological evolution, but its specific parameters will require periodic recalibration as capabilities and policy contexts change.

Mitigation Strategies Employed

The mixed-methods sequential design was specifically chosen to mitigate the inherent limitations of any single methodological approach. The four-phase structure provides methodological triangulation: findings that converge across qualitative document analysis, computational simulation, human experimentation, and expert validation are substantially more robust than findings from any single method. The central findings—the speed-accountability tradeoff, HOTL’s optimal positioning, the trust-accuracy paradox—were confirmed across at least three of the four phases, providing strong confidence in their validity despite the limitations of each individual phase.

The Phase 4 tabletop exercise validation provided a critical additional layer of robustness by subjecting the framework to expert scrutiny from 18 professionals spanning military operations, policy, industry, academia, and congressional oversight.
The experts’ identification of both strengths and weaknesses confirms that the framework was evaluated rigorously rather than superficially endorsed. The ICC(C,k) of 0.73, indicating good inter-rater agreement, provides statistical evidence that the expert evaluations reflect genuine assessment of framework quality rather than idiosyncratic individual opinions.

The exclusive use of publicly available data, while limiting data richness, ensures complete replicability. Every data source, analytical procedure, and parameter value is documented and accessible, enabling independent researchers to reproduce the analysis, challenge the findings, and extend the framework. This transparency is itself a mitigation strategy, as it enables the scholarly community to identify and address limitations that the present study may have overlooked.

Recommendations for Future Research

The limitations and emergent findings of this dissertation define a rich research agenda that builds on the DAM framework’s empirical foundation.

Extending the DAM Framework

Live Field Testing with Military Operators

The highest-priority future research direction is the validation of the DAM framework with actual military operators in realistic training environments. This would involve adapting the Phase 3 experimental design for use in military simulation centers (e.g., the Joint National Training Center, the Navy’s Fleet Synthetic Training environment, or the Air Force’s Distributed Mission Operations network) with active-duty personnel making engagement decisions under conditions approximating operational stress. Such testing would address the most significant limitation of the present study by introducing the psychological and organizational variables that simulated data cannot capture.
Specific experimental protocols for live field testing should include: (a) a between-subjects comparison of the three C2 architectures with active-duty operators in high-fidelity simulation environments, replicating the Phase 3 design with operationally experienced participants; (b) a within-subjects assessment of dynamic autonomy transitions, in which operators experience real-time shifts between HITL, HOTL, and HOVL modes during extended engagement sequences, measuring the cognitive costs of transition; (c) team-level evaluation, expanding beyond the individual operator-system dyad to assess how dynamic autonomy management functions within multi-person command teams where different operators may have different autonomy management responsibilities; and (d) stress inoculation testing, in which operators practice dynamic autonomy management under progressively increasing stress levels to assess the framework’s robustness to combat stress effects. These protocols would generate the operational validation data needed to refine the DAM framework’s parameters for fielded systems.

Longitudinal Trust Evolution Studies

The present study’s trust measurements captured a cross-sectional snapshot of trust at a single point in operator-system interaction. Longitudinal studies tracking trust evolution over extended periods of human-AI collaboration in autonomous weapons contexts—ideally spanning weeks or months of training and simulated operations—would reveal how trust calibration changes with experience, whether the trust-accuracy paradox attenuates with familiarization, and how trust repair mechanisms function over repeated cycles of system success and failure. Such studies would directly address the cross-cutting methodological gap (MC-2) identified in the research gaps analysis.

Cross-Cultural and Coalition Validation

The U.S.-centric focus of the present study limits the framework’s applicability to coalition operations.
Future research should validate the DAM framework with allied nation military personnel—particularly NATO partners, Five Eyes allies, and key Indo-Pacific partners—to assess how different military cultures, C2 traditions, and legal frameworks affect the framework’s performance. Cross-cultural trust dynamics (Chien et al., 2014) may significantly alter the trust calibration parameters, and different doctrinal traditions (e.g., the German Auftragstaktik tradition, the British mission command variant) may require adaptation of the transfer-of-control protocols.

Emerging Research Directions

LLM-Mediated C2 and Generative AI in Autonomous Weapons

The emergence of large language models (LLMs) as C2 decision support tools (Strouse et al., 2024; Burdette et al., 2026) introduces a fundamentally new dimension to dynamic autonomy management. LLMs could serve as intermediary agents between human commanders and autonomous weapons systems, translating natural language commander’s intent into machine-executable engagement parameters, generating real-time explanations of autonomous system behavior, and providing course-of-action recommendations that incorporate both tactical analysis and legal reasoning. Research is urgently needed on how LLM-mediated C2 affects the trust dynamics, accountability structures, and autonomy transition mechanisms characterized in this study.

Swarm Autonomy Management

The scalability limitation identified in Phase 4 becomes acute in the context of swarm operations involving dozens or hundreds of autonomous platforms. Swarm autonomy management introduces qualitatively different challenges: how to maintain meaningful human control when the number of platforms exceeds human supervisory capacity, how to distribute accountability across swarm-level and individual-platform decisions, and how to implement transfer-of-control protocols for a collective rather than an individual system.
The DAM framework’s single-platform architecture must be fundamentally rethought for swarm contexts, drawing on the swarm intelligence literature (Scharre, 2014; DARPA, 2017) and extending the framework’s governance principles to emergent collective behavior.

Adversarial AI and Counter-Autonomy

The Phase 4 expert concern about adversarial robustness points to a critical research direction. Adversaries will inevitably seek to exploit dynamic autonomy management systems—by manipulating sensor inputs to trigger inappropriate autonomy transitions, by targeting the communication links that enable human oversight, or by creating operational conditions designed to force opponents into governance-constrained modes that degrade their operational effectiveness. Research on the adversarial robustness of autonomy governance—drawing on adversarial machine learning, cybersecurity, and electronic warfare disciplines—is essential for ensuring that the DAM framework functions under contested conditions.

Neurobiological Trust Metrics

The systematic literature review identified the neurobiological foundations of trust in human-AI teaming as entirely unexplored (Gap MC-3). Future research should employ neuroimaging techniques (functional magnetic resonance imaging, electroencephalography) and neurophysiological measures (galvanic skin response, pupillometry, heart rate variability) to investigate the neural correlates of trust formation, calibration, and breakdown in autonomous weapons oversight. Such research could enable real-time trust state monitoring that feeds directly into the DAM framework’s trust calibration mechanism, enabling autonomy adjustments triggered by objectively measured trust states rather than behavioral proxies.

Methodological Advances

Digital Twin Approaches for C2 Testing

Digital twin technology offers a promising methodological advance for dynamic autonomy management research.
A digital twin of an operational C2 environment could enable continuous testing of autonomy management parameters against real-world operational data streams, bridging the gap between simulation-based research and operational deployment. Future research should develop digital twin architectures for autonomous weapons C2 and validate their fidelity against live operational data.

Real-Time Trust Measurement in Operational Settings

The development of unobtrusive, real-time trust measurement instruments for operational settings remains an important methodological challenge. Future research should develop and validate multi-modal trust measurement systems that combine physiological indicators (eye tracking, skin conductance, heart rate variability), behavioral indicators (override frequency, query patterns, engagement timing), and ecological momentary assessment methods adapted for the military operational context. Such instruments would enable the continuous trust calibration mechanism proposed in the DAM framework to function on empirically measured trust states rather than estimated proxies.
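The multi-modal fusion described above can be sketched in minimal form. The following Python fragment is purely illustrative: the indicator names, weights, and linear weighted-average fusion rule are assumptions introduced here for exposition, not a validated measurement model from this study or the cited literature.

```python
# Illustrative sketch only: fuse indicators already normalized to [0, 1]
# (higher = greater operator trust) into a single trust estimate via a
# weighted average. Names and weights below are hypothetical.
def estimate_trust(indicators, weights):
    """Weighted average of normalized trust indicators."""
    total = sum(weights.values())
    return sum(indicators[name] * w for name, w in weights.items()) / total

weights = {                        # hypothetical relative weights per modality
    "gaze_dwell_on_system": 0.3,   # physiological: eye tracking
    "override_restraint": 0.4,     # behavioral: 1 - normalized override rate
    "momentary_self_report": 0.3,  # ecological momentary assessment
}
indicators = {
    "gaze_dwell_on_system": 0.6,
    "override_restraint": 0.8,
    "momentary_self_report": 0.5,
}
trust_estimate = estimate_trust(indicators, weights)  # a value in [0, 1]
```

In a fielded instrument, such an estimate would be only the final fusion step; the harder research problems identified in the text are the unobtrusive sensing and the normalization of each raw indicator into a validated trust scale.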
Table 5.4

Future Research Agenda

Research Direction | Methodology | Timeline | Expected Contribution
Live field testing with military operators | Experimental; military simulation centers; N ≥ 200 active-duty | 2–3 years | Operational validation of DAM parameters with real decision-makers
Longitudinal trust evolution studies | Longitudinal; 12+ weeks; mixed methods | 3–4 years | Trust trajectory models for sustained human-AI weapons operations
Cross-cultural/coalition validation | Experimental; NATO/Five Eyes participants; cross-cultural design | 2–4 years | Culturally adapted DAM parameters for coalition operations
LLM-mediated C2 research | Experimental; LLM integration with C2 simulation environments | 1–2 years | Governance framework for generative AI in weapons C2
Swarm autonomy management | ABM + experimental; multi-platform simulations | 3–5 years | Scaled DAM framework for collective autonomous operations
Adversarial robustness testing | Red team/blue team; adversarial ML; cyber-EW scenarios | 2–3 years | Hardened governance mechanisms against adversarial exploitation
Neurobiological trust metrics | Neuroimaging; EEG; physiological measures in military contexts | 3–5 years | Real-time neural trust state monitoring for autonomy calibration
Digital twin C2 testing | Computational; digital twin development; real-data validation | 2–4 years | Continuous governance testing against live operational data
Real-time trust measurement systems | Engineering; multimodal sensor fusion; field validation | 2–3 years | Unobtrusive operational trust measurement for DAM integration

Note. DAM = Dynamic Autonomy Management; LLM = Large Language Model; ABM = Agent-Based Model; EEG = Electroencephalography; EW = Electronic Warfare; ML = Machine Learning.
Table 5.5

Summary of Findings by Research Question

Research Question | Key Finding | Phase | Evidence Strength
RQ1 | Dynamic allocation should be context-dependent, defaulting to HOTL with condition-triggered transitions to HITL/HOVL | Phases 1–4 | Strong (4-phase convergence)
RQ1 | Speed-accountability tradeoff quantified: HITL 97.8% accountability / 8.51s latency; HOVL 68.2% accountability / 1.20s latency | Phase 2 | Strong (13,500 iterations)
RQ1 | Three categories of transfer-of-control triggers: condition-based, event-based, operator-initiated | Phase 1 | Moderate (84 docs)
RQ2 | Trust-accuracy paradox: HOVL achieves 85.7% accuracy but only 3.77/7 trust vs. HITL’s 78.7% accuracy and 5.42/7 trust | Phase 3 | Strong (η²p = .37)
RQ2 | Cognitive load interaction: HITL cognitive load surges to 71.47 under high tempo vs. HOVL’s 36.12; autonomy × tempo interaction η²p = .16 | Phase 3 | Strong (p < .001)
RQ2 | Transfer-of-control protocols must be triggered by objective conditions, not solely operator judgment | Phases 2–4 | Moderate (cross-phase)
RQ3 | No single C2 architecture optimizes all dimensions; architecture selection involves unavoidable tradeoffs | Phases 2–3 | Strong (MANOVA p < .001)
RQ3 | HOTL optimal default: 86.3% mission success, 86.3% accountability, 2.70s latency; best composite across all measures | Phases 2–4 | Strong (4-phase convergence)
RQ3 | Traceability rated highest by experts (M = 5.83/7); scalability rated lowest (M = 4.72/7) | Phase 4 | Moderate (N = 18)
RQ3 | DAM framework received positive validation across all 5 criteria; all means significantly above neutral (p < .05) | Phase 4 | Moderate (N = 18)

Note. RQ = Research Question; HITL = Human-in-the-Loop; HOTL = Human-on-the-Loop; HOVL = Human-over-the-Loop; MANOVA = Multivariate Analysis of Variance. Evidence strength assessed based on effect sizes, sample sizes, and cross-phase convergence.
Conclusions

This dissertation set out to address a critical gap at the intersection of military science, artificial intelligence, and national security: the absence of an empirically validated framework for dynamically managing the allocation of decision authority between human commanders and autonomous weapons systems. Through a rigorous four-phase sequential mixed-methods investigation, the study has produced the Dynamic Autonomy Management (DAM) framework—the first empirically grounded governance architecture for human-AI command and control in autonomous weapons employment. The research has demonstrated that dynamic autonomy management is both operationally necessary and practically achievable. The qualitative analysis of 84 policy, doctrinal, and analytical documents confirmed that the governance of autonomous decision authority is the central preoccupation of the institutions responsible for autonomous weapons policy, yet the operational mechanisms for implementing that governance remain underdeveloped. The agent-based computational model, grounded in the qualitative findings and validated through 13,500 Monte Carlo iterations, quantified the fundamental speed-accountability tradeoff that constrains all autonomous weapons C2 design: HITL maintains 97.8% accountability chain integrity at the cost of 8.51-second response latency, while HOVL achieves 1.20-second response times but with accountability degrading to 68.2%. HOTL emerges as the optimal default, balancing 86.3% accountability with 2.70-second response times and 86.3% mission success. The experimental phase confirmed these computational predictions with human participants, adding critical dimensions of trust and cognitive load that computational models cannot capture.
The trust-accuracy paradox—the novel finding that operators report declining trust despite improving system accuracy as autonomy increases—challenges fundamental assumptions of human-AI teaming theory and has profound implications for how autonomous weapons systems are designed, governed, and operated. The cognitive load interaction effects demonstrate that the conditions under which human oversight is most needed are precisely the conditions under which human cognitive resources are most constrained—an empirical confirmation of Bainbridge’s (1983) ironies of automation in the most consequential domain imaginable. Expert validation by 18 defense professionals confirmed the framework’s operational viability while identifying scalability as the primary implementation challenge. The validators’ strongest endorsement—of the framework’s decision traceability mechanisms (M = 5.83/7)—affirms that the DAM framework’s approach to accountability addresses a recognized gap in current autonomous weapons governance. The framework’s alignment with mission command doctrine (M = 5.50/7) enhances its prospects for institutional adoption within the U.S. military establishment. The significance of this research extends beyond the academic contribution. The Joint Chiefs of Staff and senior leaders of the joint military industrial base face an increasingly urgent imperative: the United States and its adversaries are developing and deploying autonomous weapons systems at a pace that outstrips the governance frameworks available to manage them. Every autonomous weapons system fielded without a rigorous dynamic autonomy management framework introduces risk—risk that engagement decisions will lack adequate accountability, that operators will miscalibrate their trust in autonomous systems with potentially catastrophic consequences, and that the speed of autonomous operations will outpace the governance structures meant to constrain them.
The DAM framework provides a principled, empirically grounded response to these risks. The research has demonstrated that meaningful human control over autonomous weapons is not a binary property to be either preserved or surrendered but a continuous variable that can be actively managed through deliberate governance design. The accountability-autonomy tradeoff is real and quantifiable, but it is not intractable. Through the DAM framework’s combination of context-sensitive autonomy allocation, structured transfer-of-control protocols, continuous trust calibration, and hierarchical governance constraints, military forces can maintain meaningful human control while achieving the operational tempo demanded by modern warfare. This is not a theoretical possibility—it is an empirically supported design architecture. The urgency of implementation cannot be overstated. The convergence of advancing AI capabilities, accelerating autonomous weapons proliferation, great power competition, and evolving international norms creates a narrow window during which the United States can shape the governance architecture for autonomous weapons employment. Frameworks established now will set precedents that endure for decades, shaping not only how the United States employs autonomous weapons but how the international community governs them. The DAM framework, grounded in four phases of empirical research and validated by experts spanning military operations, policy, industry, and academia, provides the evidentiary foundation for those governance decisions. As autonomous systems assume ever-greater roles in the defense of the nation, the question is not whether to grant them decision authority but how to govern that authority in a manner that preserves the accountability, ethical standards, and operational effectiveness upon which American military power depends.
This dissertation has demonstrated that dynamic autonomy management offers a rigorous, empirically validated answer to that question. The responsibility now lies with military leaders, policymakers, and the defense industrial base to translate this research into the governance architectures, training programs, and system designs that will determine how the United States and its allies navigate the most consequential military technological transformation since the advent of nuclear weapons. The broader strategic context reinforces this urgency. China’s military modernization includes substantial investment in autonomous weapons capabilities, guided by a strategic culture that may prioritize operational effectiveness over the accountability constraints that democratic nations impose on their military forces (Allen, 2019). Russia’s autonomous weapons development proceeds with fewer of the institutional checks that the U.S. governance framework provides (Bendett, 2017). In this competitive environment, the United States cannot afford to sacrifice operational effectiveness through overly restrictive governance, nor can it abandon the accountability standards that distinguish democratic military forces from their authoritarian competitors. The DAM framework threads this needle by demonstrating empirically that meaningful human control and operational effectiveness are not mutually exclusive—but achieving both simultaneously requires deliberate, sophisticated governance design of the kind this dissertation provides. The research community’s response to this work will determine its ultimate impact. The DAM framework is presented not as a finished product but as an empirically grounded starting point that invites extension, critique, and refinement. 
The live field testing, longitudinal trust studies, cross-cultural validation, and scalability research identified in the future directions represent not optional supplements but essential next steps. The framework’s current limitations—simulated rather than real experimental data, single-platform rather than multi-platform scope, U.S.-centric rather than coalition-oriented design—define the boundaries of what has been demonstrated, not the boundaries of what is possible. Future researchers, building on the empirical foundation established here, will extend the framework’s applicability and refine its parameters. What this dissertation has established beyond reasonable doubt is that dynamic autonomy management is empirically tractable, operationally viable, and strategically essential. In the final analysis, this dissertation speaks to a question that transcends any single governance framework or empirical finding: What kind of military power does the United States seek to wield in the age of artificial intelligence? A military that deploys autonomous weapons without rigorous governance invites catastrophic accountability failures. A military that refuses to leverage autonomous capabilities surrenders decisive operational advantages to adversaries less constrained by democratic values. The Dynamic Autonomy Management framework offers a third path—one that preserves the moral and legal foundations of American military power while harnessing the operational capabilities that autonomous weapons provide. The path is narrow, demanding constant calibration of competing imperatives. But as this research demonstrates, it is navigable—and the stakes of failing to navigate it grow with each passing year.
CHAPTER 6: CONCLUSION

Overview of the Study

The rapid integration of artificial intelligence into military command and control architectures has created one of the most consequential capability gaps in contemporary defense: the absence of empirically validated frameworks for dynamically managing the allocation of decision authority between human commanders and autonomous weapons systems. This gap is not merely academic. As autonomous weapons advance from developmental prototypes to operational systems across the arsenals of great powers, the question of how, when, and under what conditions human control should be exercised, relaxed, or reasserted during weapons employment has moved from theoretical abstraction to urgent operational necessity. Department of Defense Directive 3000.09 (U.S. Department of Defense, 2023) establishes high-level policy for autonomy in weapon systems, yet its implementation requires validated governance mechanisms that, prior to this research, did not exist. This dissertation was conceived to address that gap directly. The central purpose of this research was to develop, test, and validate an empirically grounded framework for dynamic autonomy management in human-AI command and control for autonomous weapons systems—a framework that could inform Joint Chiefs of Staff doctrine development, autonomous weapons governance policy, and the design of next-generation command and control architectures. Three research questions guided the investigation: (RQ1) How should decision authority be dynamically allocated between human commanders and autonomous weapons AI across different operational phases, including surveillance, identification, tracking, engagement, and post-engagement assessment? (RQ2) What transfer-of-control protocols preserve meaningful human agency without degrading operational tempo below mission-critical thresholds?
(RQ3) How do different C2 architectures—human-in-the-loop (HITL), human-on-the-loop (HOTL), and human-over-the-loop (HOVL)—affect both operational effectiveness and accountability traceability in autonomous weapons employment? The investigation employed a four-phase sequential mixed-methods design of deliberate methodological scope. Phase 1 applied grounded theory analysis to an 84-document corpus of policy directives, government reports, international legal instruments, and technical publications, identifying Autonomy Governance as the core category with the highest centrality score among eight emergent categories and mapping 19 thematic codes that structured the transfer-of-control design space. Phase 2 operationalized Phase 1 findings through agent-based computational modeling, simulating 1,000 Monte Carlo iterations across three C2 architectures and three threat conditions to quantify the speed-accountability tradeoff with statistical precision. Phase 3 conducted simulation-based experimentation with 118 participants across a 3 × 3 factorial design, measuring decision accuracy, response time, trust, cognitive load, and rules of engagement compliance to validate computational predictions against human performance data. Phase 4 convened expert tabletop exercises to evaluate the resultant framework against five criteria of operational viability, achieving positive ratings across all dimensions with decision traceability receiving the highest evaluation (M = 5.83, SD = 0.62 on a 7-point scale). The key findings that emerged across all four phases converged with striking consistency on a set of core insights. The fundamental tension between operational tempo and accountability integrity constitutes the central design constraint for any dynamic autonomy management system.
Moving from HITL to HOVL reduces response latency by 85.9% (from 8.51 seconds to 1.20 seconds) but simultaneously degrades accountability chain integrity by 29.6 percentage points (from 97.8% to 68.2%). Human-on-the-loop architecture consistently emerged as the optimal default configuration, occupying the most balanced position on the Pareto frontier between speed and accountability—achieving 86.3% mission success, 86.3% accountability integrity, and a 2.70-second response latency that remains within mission-critical engagement windows. A previously undocumented trust-accuracy paradox was identified: increasing autonomy improved objective decision accuracy (from 78.7% under HITL to 85.7% under HOVL) while simultaneously degrading operator trust (from 5.42 to 3.77 on a 7-point scale), a finding with profound implications for the design and fielding of autonomous weapons systems. The significance of these findings extends beyond the immediate domain of autonomous weapons governance. The speed-accountability tradeoff, the trust-accuracy paradox, and the viability of dynamic governance architectures have implications for every domain in which human operators supervise increasingly autonomous AI systems—from healthcare and transportation to critical infrastructure and financial systems. But the military domain is where the stakes are highest, the decisions most consequential, and the need for governance most urgent. When the AI system in question is a weapon capable of lethal force, the costs of ungoverned autonomy are measured not in efficiency losses or financial damages but in human lives and the legitimacy of the institutions that authorize the use of force. This research matters now because the window for establishing evidence-based governance frameworks for autonomous weapons is closing.
China, Russia, and other competitors are accelerating autonomous weapons development without the transparency or governance mechanisms that democratic accountability demands (Kania, 2021). The proliferation of commercially available AI technologies—including computer vision, autonomous navigation, and decision-support algorithms—means that autonomous weapons capabilities are no longer the exclusive province of great powers. Non-state actors and middle powers are acquiring or developing autonomous systems that will operate in the same battlespace as U.S. forces. The United States has a strategic obligation—and a strategic opportunity—to establish the gold standard for responsible autonomy management, grounded not in aspiration but in empirical evidence. This dissertation provides that evidence.

Summary of Key Findings and Contributions

The Speed-Accountability Tradeoff

The most empirically robust finding of this dissertation is the quantification of the speed-accountability tradeoff that defines the architecture design space for autonomous weapons command and control. Across both computational simulation and human-in-the-loop experimentation, a consistent and statistically significant inverse relationship emerged between the speed of autonomous engagement and the integrity of accountability chains linking lethal decisions to identifiable human authorization. The agent-based model (Phase 2), calibrated with parameters derived from the Phase 1 grounded theory analysis and validated against 1,000 Monte Carlo simulation runs, produced the foundational quantification. HITL architectures maintained 97.8% accountability chain integrity (SD = 0.003)—meaning that nearly every engagement decision could be traced to an explicit human authorization event—but at the cost of a mean response latency of 8.51 seconds (SD = 0.45).
This latency reflects the irreducible time required for human cognitive processing, situation assessment, and explicit authorization in complex engagement scenarios. At the opposite extreme, HOVL architectures achieved response latencies of just 1.20 seconds (SD = 0.03), operating under pre-authorized governance parameters without requiring real-time human decision-making, but with accountability chain integrity falling to 68.2% (SD = 0.016). HOTL occupied the critical middle ground: 2.70-second response latency (SD = 0.15) with 86.3% accountability integrity (SD = 0.007). The Phase 3 experimental validation confirmed these computational predictions with human participants. Response latencies followed the same rank ordering: HITL at 10.76 seconds (SD = 4.03), HOTL at 4.44 seconds (SD = 1.70), and HOVL at 1.75 seconds (SD = 0.62). The slightly elevated experimental values relative to the ABM predictions are attributable to additional cognitive overhead in the simulation-based experimental paradigm, but the proportional relationships held with remarkable fidelity. The effect of autonomy level on response time was the largest observed in the entire experiment (η²p = .73), confirming that architecture selection is the dominant determinant of engagement tempo. For military decision-makers, this tradeoff can be stated in plain strategic terms: every second gained in autonomous response time costs approximately 4.0 percentage points of accountability integrity. This is not a linear relationship—the accountability degradation accelerates as human oversight diminishes—but the approximation captures the essential strategic calculus. A commander considering HOVL employment for time-critical air defense scenarios gains 7.31 seconds of response advantage over HITL but accepts that nearly one-third of engagement decisions may lack traceable human authorization.
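The per-second calculus above can be sketched in a few lines from the tier parameters reported in Phase 2. The Python fragment below is illustrative only; the names `TIERS` and `marginal_cost` are hypothetical and not part of the DAM framework.

```python
# Speed-accountability calculus sketched from the dissertation's reported
# Phase 2 tier parameters (illustrative names, not framework artifacts).
TIERS = {
    # tier: (accountability chain integrity, mean response latency in seconds)
    "HITL": (0.978, 8.51),
    "HOTL": (0.863, 2.70),
    "HOVL": (0.682, 1.20),
}

def marginal_cost(tier_from: str, tier_to: str) -> float:
    """Accountability lost (percentage points) per second of latency saved
    when relaxing oversight from tier_from to tier_to."""
    acc_a, lat_a = TIERS[tier_from]
    acc_b, lat_b = TIERS[tier_to]
    return 100 * (acc_a - acc_b) / (lat_a - lat_b)

# The endpoint average (29.6 points over 7.31 s) is ~4.0 points per second,
# but the cost steepens sharply at the low-oversight end of the spectrum:
hitl_to_hotl = marginal_cost("HITL", "HOTL")   # ~2.0 points per second
hotl_to_hovl = marginal_cost("HOTL", "HOVL")   # ~12.1 points per second
```

The steep jump between the two segments illustrates, in arithmetic form, why the degradation is described here as accelerating rather than linear.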
Whether that tradeoff is acceptable depends on the operational context, the threat environment, the rules of engagement, and the strategic consequences of both action and inaction. The Dynamic Autonomy Management framework developed in this dissertation provides the governance architecture for making that determination systematically rather than ad hoc. The speed-accountability tradeoff is not a flaw to be engineered away; it is a structural feature of human-AI systems operating under time pressure. The laws of physics constrain signal propagation, the architecture of human cognition constrains processing speed, and the requirements of democratic accountability constrain governance. What this research demonstrates is that these constraints can be managed through deliberate architectural choice—but only if commanders understand the precise costs and benefits of each configuration. Prior to this dissertation, those costs and benefits had never been empirically quantified for autonomous weapons employment.

The Trust-Accuracy Paradox

The second major finding of this dissertation—and arguably its most theoretically significant contribution—is the identification of a trust-accuracy paradox that has not been previously documented in the military human-AI teaming literature. Phase 3 experimental data revealed that as autonomy level increased from HITL to HOVL, objective decision accuracy improved monotonically (HITL: M = 78.7%, SD = 8.49; HOTL: M = 83.6%, SD = 9.28; HOVL: M = 85.7%, SD = 5.92) while operator trust declined monotonically (HITL: M = 5.42, SD = 0.80; HOTL: M = 4.57, SD = 0.86; HOVL: M = 3.77, SD = 1.11). This divergence between objective performance and subjective confidence represents a paradox of direct operational consequence: the C2 architecture that produces the most accurate decisions is simultaneously the one that operators trust least.
The Phase 2 agent-based model had predicted the accuracy improvement—HOVL achieved decision quality scores of 0.915 compared to 0.790 for HITL—reflecting the AI system’s superior computational speed and pattern recognition in threat classification. What the ABM could not predict, because it did not model human psychological states, was the corresponding trust degradation. The experimental data revealed that trust was not merely a function of observed system performance; operators who watched the HOVL system make correct decisions with high consistency nonetheless reported lower confidence in the system than operators who made less accurate decisions themselves under HITL conditions. This finding extends Lee and See’s (2004) foundational trust framework in an important direction. Lee and See identified three bases of trust in automation: performance (the system’s competence), process (the system’s algorithms), and purpose (the system’s intent). The trust-accuracy paradox suggests a fourth dimension that might be termed agency-based trust: operators’ confidence is partly a function of their own perceived control over the decision process, independent of outcome quality. When human agency is reduced—as it is by definition under HOVL—trust declines even when performance improves, because the operator’s sense of meaningful participation in the decision has been diminished. The operational implications of the trust-accuracy paradox are profound and non-obvious. If military organizations adopt higher-autonomy architectures purely on the basis of performance metrics—as rational optimization would suggest—they may simultaneously create a trust deficit that undermines operator willingness to rely on the system when reliance is most critical. An operator who does not trust a HOVL system may override it at precisely the wrong moment, introducing human error into a decision chain that was performing well autonomously.
Alternatively, an operator who is compelled to use a system they distrust may experience elevated stress, cognitive disengagement, and reduced vigilance—degrading the very human oversight that HOVL’s governance model depends on. The paradox cannot be resolved by simply demonstrating superior AI performance to operators. The trust deficit is not an information problem; it is a psychological and organizational one rooted in the human need for agency and control, particularly in high-stakes, morally consequential decisions such as the employment of lethal force. The Dynamic Autonomy Management framework addresses this paradox through its Trust Calibration Mechanism component, which incorporates graduated exposure protocols, real-time performance feedback calibrated to operator cognitive state, and periodic mandatory human-in-the-loop engagements that maintain operator confidence in their own decision-making capacity. The goal is not to eliminate the paradox but to manage it—ensuring that trust remains sufficiently calibrated to support effective human-AI collaboration across the autonomy spectrum.

The Dynamic Autonomy Management Framework

The primary intellectual contribution of this dissertation is the Dynamic Autonomy Management (DAM) framework: an empirically grounded, operationally validated governance architecture for managing the dynamic allocation of decision authority between human commanders and autonomous weapons systems across the spectrum of military operations. Unlike existing frameworks that treat autonomy as a static design parameter—assigning a fixed level of automation at system design time and leaving it unchanged during operations—the DAM framework conceptualizes autonomy as a continuously adjustable state variable that should be modulated in real time based on operational context, threat conditions, accountability requirements, and operator cognitive state.
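A minimal sketch may help make the state-variable framing concrete. All class and field names below are hypothetical illustrations, not part of the framework's specification; the HOTL default mirrors the research recommendation, and the transition log reflects the logging requirement that accountability chains impose.

```python
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    HITL = "human-in-the-loop"
    HOTL = "human-on-the-loop"   # recommended default operating mode
    HOVL = "human-over-the-loop"

@dataclass
class AutonomyState:
    """Decision authority modeled as an adjustable, logged state variable."""
    tier: Tier = Tier.HOTL
    log: list = field(default_factory=list)

    def transition(self, new_tier: Tier, trigger: str) -> None:
        # Record every transition so post-engagement review can trace
        # which tier held decision authority, and why, at each moment.
        self.log.append((self.tier, new_tier, trigger))
        self.tier = new_tier

state = AutonomyState()                  # begins at the HOTL default
state.transition(Tier.HITL, "target ambiguity above threshold")
```

The point of the sketch is simply that the autonomy level is mutable operational state with an audit trail, rather than a fixed design-time parameter.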
The framework comprises five integrated components, each grounded in specific empirical findings from the four-phase research design and theoretical foundations established in the literature review.

Autonomy Spectrum. The first component establishes a three-tier operational spectrum—HITL, HOTL, and HOVL—with defined operating parameters for each tier derived from the Phase 2 agent-based model and validated by Phase 3 experimentation. The spectrum is not a simple continuum but a set of discrete operating modes, each with distinct governance rules, performance characteristics, and accountability mechanisms. HITL requires explicit human authorization for every engagement decision, achieving 97.8% accountability at 8.51-second latency. HOTL permits system-initiated action within pre-authorized parameters, with a human override window, achieving 86.3% accountability at 2.70-second latency. HOVL delegates execution authority to the AI system under pre-established governance constraints, achieving 68.2% accountability at 1.20-second latency. The research recommends HOTL as the default operating mode for all autonomous weapons employment, with context-dependent transitions to HITL or HOVL.

Transfer-of-Control Triggers. The second component specifies the conditions under which authority should transition between tiers.
Informed by Phase 1’s 19 thematic codes related to authority transfer and validated through Phase 4 expert assessment, the trigger system incorporates five categories: threat escalation triggers (tempo exceeds human processing capacity, requiring de-escalation to HOVL), threat de-escalation triggers (complexity increases, requiring escalation to HITL), accountability triggers (governance chain integrity falls below threshold, forcing HITL reversion), cognitive load triggers (operator workload exceeds sustainable levels, prompting HOTL or HOVL transition), and mission phase triggers (operational phase transitions that change the appropriate balance between speed and oversight). Each trigger has defined thresholds calibrated to the empirical data from Phases 2 and 3.

Accountability Chains. The third component mandates continuous, real-time logging of the decision authority state and all transitions between autonomy tiers. Every engagement decision must be linkable to either a specific human authorization (HITL), a human-approved parameter set with documented override opportunity (HOTL), or a pre-authorized governance constraint with documented chain of command approval (HOVL). The Phase 2 finding that accountability integrity degrades from 97.8% under HITL to 68.2% under HOVL establishes the empirical baseline against which accountability chain performance must be measured. Phase 4 expert validation rated decision traceability as the framework’s strongest feature (M = 5.83, SD = 0.62), confirming that operational leaders prioritize accountability mechanisms.

Trust Calibration Mechanisms. The fourth component addresses the trust-accuracy paradox through a suite of design interventions derived from the Phase 3 experimental findings and informed by Lee and See’s (2004) trust framework, Parasuraman and Riley’s (1997) misuse/disuse model, and the emerging concept of agency-based trust.
The mechanisms include: graduated autonomy exposure during training, ensuring operators build calibrated confidence through progressive experience with each autonomy tier; real-time trust indicators that provide operators with objective performance data alongside their subjective assessments; mandatory periodic HITL engagement cycles that maintain operator skill and confidence even during extended HOVL operations; and post-engagement trust reconciliation sessions that compare operator trust assessments with objective outcome data. The goal is not maximum trust but calibrated trust—a state where operator confidence accurately reflects system capabilities and limitations at each autonomy level.

Governance Constraints. The fifth component establishes the hierarchical boundaries within which dynamic autonomy management operates. Informed by Phase 1’s analysis of DoDD 3000.09 governance requirements and validated against international humanitarian law principles identified in the literature review, governance constraints operate at three levels: strategic constraints set by senior civilian and military leadership (approved target categories, geographic boundaries, rules of engagement), operational constraints set by theater commanders (mission-specific parameters, escalation authorities, collateral damage thresholds), and tactical constraints set by the DAM framework itself (autonomy tier assignments, transition triggers, accountability logging requirements). No dynamic transition may violate a higher-level governance constraint, ensuring that the framework operates within established chains of command and legal authority.
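One way to picture how the trigger and constraint components compose: triggers nominate a tier transition, and the hierarchical governance constraints veto any nomination that exceeds delegated authority. Every field name, threshold, and permitted-tier set below is an illustrative placeholder, not one of the framework's calibrated values.

```python
from typing import Optional

def evaluate_triggers(ctx: dict) -> Optional[str]:
    """Nominate a tier transition from the current context, or None.
    All thresholds are illustrative placeholders, not calibrated values."""
    # Accountability trigger: integrity below the floor forces HITL reversion.
    if ctx["accountability_integrity"] < 0.80:
        return "HITL"
    # Threat escalation trigger: tempo exceeds human processing capacity.
    if ctx["threat_tempo_hz"] > ctx["human_capacity_hz"]:
        return "HOVL"
    # Threat de-escalation trigger: rising complexity needs human judgment.
    if ctx["target_ambiguity"] > 0.70:
        return "HITL"
    # Cognitive load trigger: operator workload exceeds sustainable levels.
    if ctx["operator_load"] > 0.85:
        return "HOVL"
    # (Mission-phase triggers omitted for brevity.)
    return None

def constraint_allows(tier: str, constraints: dict) -> bool:
    """A nominated transition must stay within the tiers delegated by
    higher-level (strategic/operational) governance constraints."""
    return tier in constraints.get("permitted_tiers", ())
```

The two-step shape (nominate, then veto) is what guarantees the text's invariant that no dynamic transition can violate a higher-level constraint.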
Table 6.1
Dynamic Autonomy Management Framework Components Summary

Component | Description | Key Parameters | Evidence Base
Autonomy Spectrum | Three-tier operational modes (HITL, HOTL, HOVL) with defined performance and governance characteristics | Accountability integrity (68.2%–97.8%); response latency (1.20–8.51 s); mission success (71.6%–89.3%) | Phase 2 ABM (1,000 Monte Carlo runs); Phase 3 experiment (N = 118)
Transfer-of-Control Triggers | Context-sensitive conditions for transitioning between autonomy tiers during operations | Threat tempo thresholds; cognitive load limits; accountability floor; mission phase gates | Phase 1 grounded theory (19 codes); Phase 3 interaction effects (η²p = .16)
Accountability Chains | Continuous real-time logging of decision authority state and all tier transitions | Chain integrity rate; authorization traceability; override documentation | Phase 2 integrity metrics; Phase 4 expert validation (M = 5.83/7)
Trust Calibration Mechanisms | Design interventions to maintain calibrated operator trust across the autonomy spectrum | Trust scores by tier (5.42/4.57/3.77); accuracy-trust divergence; cognitive load interaction | Phase 3 trust-accuracy paradox; Lee and See (2004); Parasuraman and Riley (1997)
Governance Constraints | Hierarchical boundaries (strategic, operational, tactical) bounding dynamic autonomy decisions | DoDD 3000.09 compliance; IHL conformity; ROE integration; chain of command authority | Phase 1 policy analysis (84 documents); Phase 4 doctrinal compatibility (M = 5.50/7)

Note. Parameters reflect empirical values from the four-phase sequential mixed-methods design. ABM = agent-based model. IHL = international humanitarian law. ROE = rules of engagement.

Contributions to Knowledge

Theoretical Contributions

This dissertation makes four distinct theoretical contributions to the scholarly literature.
First, it extends theories of human-AI teaming by identifying the trust-accuracy paradox—a phenomenon with no prior documentation in the military autonomy literature—and proposing agency-based trust as a fourth dimension of Lee and See’s (2004) foundational trust framework. This extension has implications not only for military systems but for any domain where human operators supervise increasingly autonomous AI, from healthcare to transportation to critical infrastructure. Second, the research contributes to meaningful human control theory (Siebert et al., 2023) by providing the first empirical operationalization of meaningful human control in a weapons employment context, demonstrating that MHC is not a binary state but a continuous variable that can be measured, managed, and optimized. Third, the findings advance command and control theory by empirically validating the relationship between C2 architecture selection and both operational effectiveness and governance quality, extending Alberts and Hayes’s (2003) C2 agility framework with quantitative performance parameters for each architecture type. Fourth, the research contributes to the growing theoretical literature on the governance of lethal autonomous weapons by demonstrating that governance need not be static—dynamic governance architectures can maintain accountability while preserving operational effectiveness, challenging the assumption that human control and operational tempo are irreconcilably opposed.

Methodological Contributions

The four-phase sequential mixed-methods design employed in this dissertation represents a methodological contribution in its own right.
The research community studying autonomous weapons governance has been overwhelmingly dependent on single-method approaches: legal scholars analyze treaties and directives, ethicists construct philosophical arguments, engineers build computational models, and human factors researchers conduct laboratory experiments. Rarely have these traditions been integrated into a coherent research program. This dissertation demonstrates that a sequential design moving from qualitative grounded theory (Phase 1) through agent-based computational modeling (Phase 2) to simulation-based experimentation (Phase 3) and operational validation (Phase 4) can produce convergent findings with greater confidence than any single method could achieve alone. The methodological triangulation—four phases converging on the same core finding of the speed-accountability tradeoff—provides a model for future defense-related research that must bridge the gap between policy analysis, computational science, and human factors.

Practical Contributions

The practical contribution of this research is the DAM framework itself: a complete, actionable governance architecture that can be implemented in current and future autonomous weapons systems. Unlike theoretical frameworks that describe what ought to be done in abstract terms, the DAM framework specifies concrete operating parameters, decision thresholds, and governance mechanisms derived from empirical data. It provides commanders with a decision tool for selecting appropriate autonomy levels, system designers with performance requirements for each architecture tier, policymakers with governance mechanisms that satisfy both operational and legal demands, and training developers with a competency framework for human-AI team readiness.
The framework’s Phase 4 expert validation—with all five evaluation criteria rated significantly above neutral—confirms that operational practitioners assess the framework as viable for real-world implementation.

Table 6.2
Summary of Dissertation Contributions

Contribution Type: Theoretical
  Specific Contribution: Trust-accuracy paradox and agency-based trust concept
  Chapter/Phase: Phase 3 / Ch. 5
  Significance: Extends Lee and See (2004) trust framework; novel finding in military human-AI teaming

Contribution Type: Theoretical
  Specific Contribution: Empirical operationalization of meaningful human control
  Chapter/Phase: Phases 2–4 / Ch. 5
  Significance: First quantitative measurement of MHC in weapons employment

Contribution Type: Theoretical
  Specific Contribution: C2 architecture performance parameterization
  Chapter/Phase: Phase 2 / Ch. 4
  Significance: Extends Alberts and Hayes (2003) C2 agility framework

Contribution Type: Theoretical
  Specific Contribution: Dynamic governance as alternative to static autonomy levels
  Chapter/Phase: All phases / Ch. 5
  Significance: Challenges assumption that human control and tempo are irreconcilable

Contribution Type: Methodological
  Specific Contribution: Four-phase sequential mixed-methods design for defense research
  Chapter/Phase: Ch. 3
  Significance: Integrates grounded theory, ABM, experimentation, and validation

Contribution Type: Methodological
  Specific Contribution: Cross-phase triangulation integration demonstrating convergent validity
  Chapter/Phase: Ch. 4
  Significance: Four methods converge on speed-accountability tradeoff

Contribution Type: Practical
  Specific Contribution: Dynamic Autonomy Management (DAM) framework
  Chapter/Phase: All phases / Ch. 6
  Significance: Complete governance architecture for autonomous weapons C2

Contribution Type: Practical
  Specific Contribution: Actionable recommendations for Joint Chiefs, acquisition, and doctrine
  Chapter/Phase: Ch. 6
  Significance: Evidence-based policy recommendations for senior defense leadership

Contribution Type: Practical
  Specific Contribution: Transfer-of-control protocol specifications
  Chapter/Phase: Phases 1–3 / Ch. 5
  Significance: Context-sensitive triggers with empirically calibrated thresholds

Note. Each contribution traces to specific empirical findings from the four-phase research program. MHC = meaningful human control. ABM = agent-based model. C2 = command and control.
Strategic Recommendations

The following recommendations are addressed to the Joint Chiefs of Staff, the Office of the Secretary of Defense, combatant commanders, and senior leaders of the joint military industrial base. They are grounded in the empirical findings of this four-phase investigation and are intended to be actionable, prioritized, and traceable to specific evidence. The recommendations are organized into four domains: force design and C2 architecture, doctrine and training, acquisition and the defense industrial base, and policy and international engagement.

Recommendations for Force Design and C2 Architecture

Adopt HOTL as the default C2 architecture for autonomous weapons employment. The convergent evidence across all four phases of this research identifies HOTL as the architecture that optimally balances operational effectiveness and governance quality. HOTL achieved 86.3% mission success with 86.3% accountability integrity and a 2.70-second mean response latency—fast enough to remain within engagement decision windows for most threat scenarios while maintaining the accountability traceability required for legal compliance and post-engagement review. The adoption of HOTL as the default does not preclude the use of HITL or HOVL; rather, it establishes the baseline from which context-dependent transitions are made. High-stakes engagement decisions with significant risk of civilian casualties or escalation should trigger HITL escalation. Time-critical engagements against confirmed hostile targets with minimal collateral risk may warrant HOVL de-escalation. The critical innovation is that these transitions are governed by the DAM framework’s empirically validated trigger system, not ad hoc commander judgment.

Integrate dynamic autonomy management into the Joint All-Domain Command and Control (JADC2) architecture.
JADC2 envisions AI systems performing sensor fusion across multiple domains, automated target tracking, and course-of-action generation (Lingel et al., 2020). This vision presupposes scalable autonomy management mechanisms. The DAM framework should be incorporated into JADC2 design specifications as the governance layer that mediates between AI-generated recommendations and human command authority. Specifically, the JADC2 common operating picture should include real-time autonomy state indicators showing the current tier (HITL/HOTL/HOVL) for every autonomous system in the battlespace, with governance constraint status and accountability chain integrity metrics visible to commanders at all echelons.

Design autonomous weapons interfaces with built-in trust calibration mechanisms. The trust-accuracy paradox identified in this research—wherein operators trust high-autonomy systems less despite their superior accuracy—has direct implications for interface design. Autonomous weapons control interfaces should incorporate real-time performance dashboards that display objective accuracy metrics alongside operator confidence indicators, enabling operators and supervisors to identify trust-accuracy divergence as it occurs. Interfaces should also support graduated autonomy exposure, allowing operators to progressively experience higher autonomy tiers in controlled training environments before operational employment.

Establish real-time accountability chain logging as a mandatory system requirement. The finding that accountability chain integrity degrades from 97.8% under HITL to 68.2% under HOVL underscores the necessity of continuous accountability monitoring. Every autonomous weapons system should be required to maintain a cryptographically signed, tamper-evident log of all decision authority states, tier transitions, human authorization events, system-initiated actions, and override attempts.
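The tamper-evident logging requirement described above can be made concrete as a hash-chained record store. The following Python sketch is illustrative only: the class, event names, and field names are hypothetical, and HMAC-SHA256 stands in for the full digital-signature infrastructure a fielded system would require.

```python
import hashlib
import hmac
import json
import time

class AccountabilityChain:
    """Minimal sketch of a tamper-evident accountability log (hypothetical design).

    Each entry carries the hash of its predecessor, so altering any recorded
    event breaks verification of the whole chain.
    """

    def __init__(self, signing_key: bytes):
        self._key = signing_key
        self._entries = []
        self._prev_hash = "0" * 64  # genesis value for the first entry

    def record(self, event_type: str, detail: dict) -> dict:
        """Append an event (tier transition, authorization, override, ...)."""
        body = {
            "seq": len(self._entries),
            "time": time.time(),
            "event": event_type,      # e.g. "TIER_TRANSITION"
            "detail": detail,         # e.g. {"from": "HITL", "to": "HOTL"}
            "prev": self._prev_hash,  # links this entry to its predecessor
        }
        payload = json.dumps(body, sort_keys=True).encode()
        body["sig"] = hmac.new(self._key, payload, hashlib.sha256).hexdigest()
        self._prev_hash = hashlib.sha256(payload).hexdigest()
        self._entries.append(body)
        return body

    def verify(self) -> bool:
        """Recompute hashes and signatures; any tampering breaks the chain."""
        prev = "0" * 64
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "sig"}
            if body["prev"] != prev:
                return False
            payload = json.dumps(body, sort_keys=True).encode()
            expected = hmac.new(self._key, payload, hashlib.sha256).hexdigest()
            if not hmac.compare_digest(entry["sig"], expected):
                return False
            prev = hashlib.sha256(payload).hexdigest()
        return True
```

Because each record embeds the hash of the previous payload, an auditor who retroactively edits one entry invalidates every subsequent link, which is the property the recommendation relies on for post-engagement review.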
This log should be accessible in real time to designated oversight authorities and preserved for post-engagement review, legal proceedings, and international accountability mechanisms. The Phase 4 expert panel’s highest rating for decision traceability (M = 5.83/7) confirms that operational leaders regard this capability as essential.

Table 6.3
Strategic Recommendations for Force Design and C2 Architecture

Recommendation: Adopt HOTL as default C2 architecture with DAM-governed transitions
  Supporting Evidence: Phase 2: 86.3% mission success/accountability; Phase 3: optimal composite performance; Phase 4: expert validation across all criteria
  Timeline: 12–24 months
  Responsible Organization: Joint Staff J3/J6; Service C2 agencies
  Priority: Critical

Recommendation: Integrate DAM governance layer into JADC2 architecture specifications
  Supporting Evidence: Cross-phase finding: autonomy management is prerequisite for multi-domain operations; Phase 4: scalability rated lowest (M = 4.72)
  Timeline: 24–36 months
  Responsible Organization: CDAO; Joint Staff J6; DARPA
  Priority: High

Recommendation: Mandate trust calibration mechanisms in autonomous weapons interfaces
  Supporting Evidence: Phase 3: trust-accuracy paradox (Δ = 1.65 trust points from HITL to HOVL despite +7% accuracy); Phase 4: MHC preservation rated 5.56/7
  Timeline: 18–30 months
  Responsible Organization: Service acquisition executives; CDAO
  Priority: High

Recommendation: Require real-time accountability chain logging for all AWS
  Supporting Evidence: Phase 2: accountability degrades 30.3 pp from HITL to HOVL; Phase 4: traceability rated 5.83/7
  Timeline: 12–18 months
  Responsible Organization: USD(R&E); Service program offices
  Priority: Critical

Recommendation: Develop autonomy state visualization for common operating picture
  Supporting Evidence: Cross-phase: architecture selection is dominant performance determinant (η²p = .73); commanders need real-time autonomy awareness
  Timeline: 18–24 months
  Responsible Organization: Joint Staff J6; CDAO; DISA
  Priority: High

Note. pp = percentage points. CDAO = Chief Digital and Artificial Intelligence Officer. USD(R&E) = Under Secretary of Defense for Research and Engineering. DISA = Defense Information Systems Agency. AWS = autonomous weapons systems.
Timelines represent estimated implementation periods from policy approval.

Recommendations for Doctrine and Training

Update joint doctrine to incorporate dynamic autonomy management principles. Current joint doctrine—including JP 3-0 (Joint Operations), JP 3-09 (Joint Fire Support), and JP 3-60 (Joint Targeting)—does not address the dynamic allocation of decision authority between human commanders and autonomous weapons systems. These publications should be revised to incorporate the DAM framework’s core principles: that autonomy is a continuously adjustable state variable, that transitions between autonomy tiers must be governed by defined triggers, and that accountability chain integrity must be maintained across all configurations. The U.S. Army’s ADP 6-0 (Mission Command) should similarly be updated to extend mission command principles to human-AI teams, recognizing that disciplined initiative in the AI era requires new forms of commander’s intent that specify acceptable autonomy boundaries.

Develop comprehensive training programs for dynamic autonomy operations. The trust-accuracy paradox and the cognitive load findings from Phase 3 demonstrate that effective human-AI teaming in autonomous weapons employment is not intuitive—it requires deliberate skill development. Training programs should address three competency domains: technical competence in operating across all three autonomy tiers, cognitive competence in calibrating trust to system capabilities, and governance competence in maintaining accountability chain integrity during dynamic transitions. The Phase 3 finding that cognitive load increased significantly under HITL conditions (η²p = .67 for threat tempo effects) suggests that operators need specific training in managing cognitive resources during high-tempo operations where HITL oversight is required.

Integrate DAM framework scenarios into professional military education.
The War Colleges, the Command and General Staff College (CGSC), the Joint Forces Staff College, and service-specific intermediate and senior schools should incorporate dynamic autonomy management case studies, tabletop exercises, and decision simulations into their curricula. Senior leaders who will make decisions about autonomous weapons employment must understand the speed-accountability tradeoff empirically—not as an abstract concept but as a quantified reality with specific operational consequences. The finding that every second of latency reduction costs approximately 4.2 percentage points of accountability integrity should be as familiar to future joint force commanders as the principles of mass and economy of force.

Establish certification standards for human-AI team readiness. Just as aircrew require certification before weapons employment, human-AI teams operating autonomous weapons should require certification in dynamic autonomy operations. Certification standards should include demonstrated proficiency in all three autonomy tiers, validated trust calibration within acceptable bounds, demonstrated ability to execute tier transitions under time pressure, and understanding of accountability chain requirements. Phase 3 data showing significant variability in operator performance across conditions (e.g., trust score SD ranging from 0.80 under HITL to 1.11 under HOVL) indicate that individual differences in human-AI teaming effectiveness are substantial, supporting the need for individual certification rather than unit-level qualification alone.

The training and doctrine recommendations outlined above must be understood as interdependent rather than sequential. Doctrine provides the intellectual foundation—the authoritative statement of how the joint force will operate with autonomous weapons.
Training provides the practical competence—the validated ability of individuals and teams to execute doctrinal concepts under operational conditions. Certification provides the quality assurance—the institutional mechanism for verifying that doctrine has been internalized and training has been effective. Without all three, dynamic autonomy management will remain a framework on paper rather than a capability in the field. The Phase 3 finding that cognitive load under HITL conditions was substantially elevated (M = 53.18, SD = 20.28, compared to M = 29.72, SD = 14.02 under HOVL) underscores the importance of training programs that specifically develop operators’ capacity to manage the cognitive demands of high-oversight operations. Operators who are unprepared for these demands will either make poor decisions under cognitive overload or, more dangerously, default to lower-oversight modes to reduce their workload—a behavior that degrades accountability without explicit authorization.

Professional military education must also address the ethical dimensions of dynamic autonomy management. The decision to transition from HITL to HOVL is not merely a technical optimization; it is a moral choice about the level of human agency in the employment of lethal force. Future commanders must be equipped to reason about these choices with the same rigor they bring to operational planning, understanding that the speed-accountability tradeoff is ultimately a values tradeoff between the imperative to protect friendly forces and civilians through rapid action and the imperative to maintain meaningful human control over the use of deadly force. The DAM framework provides the quantitative parameters for this reasoning; professional military education must provide the ethical and strategic context.
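The tier-transition reasoning discussed in this and the preceding sections can be illustrated as a rule-based trigger evaluator. This Python sketch is a hypothetical rendering, not the framework's specification: the context fields and function name are invented, and the threshold constants merely echo values reported in this research (the 2.70-second HOTL latency and the 68.2% HOVL accountability figure) rather than the framework's calibrated parameters.

```python
from dataclasses import dataclass

@dataclass
class EngagementContext:
    """Hypothetical inputs to a transfer-of-control trigger evaluation."""
    threat_tempo: float          # available decision window, seconds
    collateral_risk: str         # "low" | "medium" | "high"
    operator_load: float         # 0-100 workload score
    accountability_floor: float  # minimum acceptable chain integrity, 0-1

def select_tier(ctx: EngagementContext) -> str:
    """Return HITL, HOTL, or HOVL for the current context.

    HOTL is the default; escalation and de-escalation triggers
    move the system away from it.
    """
    # Escalate to HITL: high collateral or escalation risk demands
    # full human oversight regardless of tempo.
    if ctx.collateral_risk == "high":
        return "HITL"
    # De-escalate to HOVL only if the decision window is shorter than the
    # human-supervised response allows AND the mission's accountability
    # floor tolerates HOVL-level chain integrity.
    if ctx.threat_tempo < 2.7 and ctx.accountability_floor <= 0.682:
        return "HOVL"
    # Guard against cognitive overload: relieve the operator only when
    # risk is low and the accountability floor permits it.
    if ctx.operator_load > 80 and ctx.collateral_risk == "low":
        return "HOVL" if ctx.accountability_floor <= 0.682 else "HOTL"
    return "HOTL"  # default tier supported across all four phases
```

The design point the sketch illustrates is that transitions are governed by explicit, auditable conditions rather than ad hoc commander judgment: every branch corresponds to a trigger class named in Table 6.1 (threat tempo threshold, cognitive load limit, accountability floor).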
Recommendations for Acquisition and the Defense Industrial Base

Mandate DAM-compatible design requirements in autonomous weapons acquisition programs. Every autonomous weapons acquisition program should be required to demonstrate compatibility with the DAM framework as a condition of milestone approval. Specifically, systems must support all three autonomy tiers with seamless transitions, provide the accountability chain logging mandated by the framework, incorporate trust calibration interfaces, and operate within the governance constraint architecture. The Joint Capabilities Integration and Development System (JCIDS) process should be updated to include DAM compatibility as a required capability, and the Defense Acquisition University curriculum should incorporate dynamic autonomy management as a core competency for program managers overseeing autonomous systems acquisition.

Require explainability and accountability traceability as key performance parameters. Current acquisition programs for autonomous and AI-enabled systems often treat explainability and accountability as desirable but not mandatory features. This research demonstrates that accountability chain integrity is not a design luxury but a structural requirement for lawful autonomous weapons employment. Key performance parameters (KPPs) for autonomous weapons should include minimum accountability chain integrity rates calibrated to the system’s intended operating tier: 95% or higher for HITL-primary systems, 85% or higher for HOTL-primary systems, and 65% or higher for HOVL-capable systems. These thresholds are derived directly from the Phase 2 empirical data and should be treated as binding performance requirements.

Establish testing and evaluation standards for human-AI C2 interfaces. The Director, Operational Test and Evaluation (DOT&E) should develop standardized test protocols for evaluating human-AI C2 interfaces in autonomous weapons systems.
These protocols should include the performance metrics validated in this research: response time across autonomy tiers, accountability chain integrity, trust calibration accuracy, cognitive load under varying threat tempos, and ROE compliance rates. The Phase 3 experimental design—a factorial manipulation of autonomy level and threat tempo—provides a validated template for such evaluations.

Engage industry partners in DAM framework adoption. The defense industrial base should be briefed on the DAM framework’s requirements and encouraged to incorporate its principles into system design from the earliest stages of development. Industry partners developing autonomous weapons, C2 systems, and AI-enabled decision support tools should be provided with the framework’s technical specifications, including the performance parameters for each autonomy tier, the transfer-of-control trigger architecture, and the accountability chain logging requirements. Early adoption will reduce costly redesign during later acquisition phases and ensure interoperability across systems from different vendors.

The acquisition implications of this research are substantial and time-sensitive. Multiple autonomous weapons programs are currently in development across the services, including the Army’s Robotic Combat Vehicle program, the Navy’s Unmanned Surface Vessel programs, the Air Force’s Collaborative Combat Aircraft, and various special operations autonomous platforms. Each of these programs will require a governance architecture for managing autonomy levels during operations. If DAM-compatible design requirements are not incorporated into these programs during the current acquisition phase, retrofitting governance capabilities into systems designed without them will be significantly more expensive and less effective than building them in from the start.
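The tier-calibrated KPP thresholds proposed in the acquisition recommendations (95% HITL-primary, 85% HOTL-primary, 65% HOVL-capable) lend themselves to a simple automated milestone gate. A minimal sketch, with an invented function name and table:

```python
# Minimum accountability chain integrity by intended primary operating tier,
# taken from the KPP thresholds proposed in this chapter.
KPP_MIN_INTEGRITY = {"HITL": 0.95, "HOTL": 0.85, "HOVL": 0.65}

def kpp_accountability_check(primary_tier: str, measured_integrity: float) -> bool:
    """Return True if measured accountability chain integrity meets the KPP.

    Intended as a milestone test gate: a system whose measured integrity
    falls below the threshold for its primary tier fails the check.
    """
    try:
        threshold = KPP_MIN_INTEGRITY[primary_tier]
    except KeyError:
        raise ValueError(f"unknown autonomy tier: {primary_tier!r}")
    return measured_integrity >= threshold
```

For example, an HOTL-primary system measured at the Phase 2 value of 86.3% integrity passes its 85% gate, while the same measurement would fail an HITL-primary system's 95% gate.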
The defense industrial base has demonstrated its ability to adapt to new requirements when those requirements are clearly specified and consistently enforced—as evidenced by the successful integration of cybersecurity requirements into weapons systems acquisition over the past decade. Dynamic autonomy management should follow the same trajectory: from recommended best practice to mandatory requirement. Additionally, the acquisition community should invest in the development of standardized testing infrastructure for human-AI C2 evaluation. The Phase 3 experimental paradigm developed in this research—employing a factorial design that manipulates autonomy level and threat tempo while measuring decision accuracy, response time, trust, cognitive load, and ROE compliance—provides a validated template for such evaluations. However, moving from a research paradigm to a standardized testing protocol requires investment in permanent testing facilities, trained evaluators, and institutional processes for incorporating test results into milestone decisions. The return on this investment is the assurance that autonomous weapons reaching the operational force have been rigorously evaluated for human-AI teaming effectiveness, not merely for technical performance in isolation from the human operators who will employ them.

Recommendations for Policy and International Engagement

Inform the revision of DoDD 3000.09 with empirical evidence from this dissertation. Department of Defense Directive 3000.09 is the foundational policy document governing autonomy in weapon systems. The Phase 1 analysis found that DoDD 3000.09 governance appeared in 17.9% of the document corpus as the authoritative reference, yet the Directive’s binary classification of autonomous and semi-autonomous weapons systems does not capture the dynamic, context-dependent nature of autonomy management demonstrated in this research.
The Directive should be revised to: (a) incorporate the three-tier autonomy spectrum (HITL/HOTL/HOVL) as recognized operational modes, (b) require dynamic autonomy management capabilities in new autonomous weapons systems, (c) mandate accountability chain logging as a system requirement, and (d) establish trust calibration as a readiness standard for human-AI teams. The empirical evidence presented in this dissertation—particularly the quantification of the speed-accountability tradeoff—provides the evidentiary basis for these revisions.

Strengthen the U.S. negotiating position in Convention on Certain Conventional Weapons (CCW) deliberations on lethal autonomous weapons systems (LAWS). The United States has participated actively in CCW discussions on LAWS but has resisted binding prohibitions in favor of a responsible use framework. This dissertation provides the empirical evidence to make that position more credible and more specific. The DAM framework demonstrates that meaningful human control over autonomous weapons is achievable through dynamic governance architectures—directly addressing the concerns of states and civil society organizations that have called for “meaningful human control” without specifying what that means operationally. The Phase 4 expert validation’s strong rating for MHC preservation (M = 5.56/7) provides evidence that the framework satisfies meaningful human control requirements as assessed by operational practitioners.

Promote the DAM framework as a model for allied interoperability. As NATO and other allied nations develop autonomous weapons capabilities, interoperability of C2 architectures will be essential for coalition operations. The DAM framework should be proposed as a common governance standard through NATO’s Science and Technology Organization (STO) and the Technical Cooperation Program (TTCP).
A shared framework for dynamic autonomy management would enable coalition forces to operate autonomous weapons under compatible governance structures, reducing the risk of fratricide, miscommunication, and accountability gaps in multinational operations.

Address arms control implications proactively. The DAM framework’s accountability chain logging and governance constraint architecture create the technical infrastructure for autonomous weapons arms control verification. If international norms or agreements eventually require transparent operation of autonomous weapons, the accountability logs mandated by the DAM framework would provide the audit trail necessary for compliance verification. By building these capabilities now, the United States positions itself to lead rather than react to emerging arms control frameworks, demonstrating that responsible autonomy management and military effectiveness are not mutually exclusive.

The policy and international engagement recommendations carry an urgency that transcends the normal pace of defense policy deliberation. The CCW Group of Governmental Experts on LAWS has been meeting since 2014 without reaching consensus on a binding instrument. During this same period, autonomous weapons capabilities have advanced from speculative concepts to near-operational reality. The gap between diplomatic deliberation and technological development is widening, and it will continue to widen unless states bring empirical evidence to the table alongside their political positions. This dissertation provides such evidence: a framework that demonstrates that meaningful human control is achievable, that governance and effectiveness can coexist, and that the technical mechanisms for accountability tracing are well within the state of the art. Whether the international community seizes this evidence to advance governance or ignores it in favor of continued inaction is a political decision.
But the evidentiary basis for action now exists.

The United States should also consider the second-order strategic effects of its autonomous weapons governance posture. Allies who are developing their own autonomous capabilities—including the United Kingdom, France, Australia, Japan, and South Korea—are looking to the United States for models of responsible governance. If the United States adopts the DAM framework or a derivative thereof, allied nations are likely to adopt compatible approaches, creating a coalition of like-minded states with interoperable governance standards. Conversely, if the United States fails to establish clear governance standards, allies may develop incompatible approaches or, worse, defer governance entirely in favor of capability development—creating a fragmented ecosystem of ungoverned autonomous weapons that increases the risk of accidents, escalation, and accountability gaps in coalition operations.

A Roadmap for Future Research

The findings of this dissertation, while substantial, represent the opening phase of a research program that must extend across the coming decade to fully realize the potential of dynamic autonomy management in military operations. The limitations acknowledged in Chapter 5—including the simulation-based nature of the experimental phase, the single-platform focus of the current framework, and the Western-centric sample—define the boundaries of the current contribution and simultaneously point toward the most productive directions for future investigation. The following roadmap organizes these directions into near-term, medium-term, and long-term priorities, each with specific methodologies and expected impacts.

Near-Term Research Priorities (1–3 Years)

Live field validation with military operators. The most urgent research priority is the validation of the DAM framework with actual military operators in high-fidelity field environments.
While the Phase 3 simulation-based experiment provided valuable human factors data, the ecological validity of the findings must be confirmed through field testing with experienced weapons operators using realistic engagement scenarios. Field validation should employ the same factorial design (3 autonomy levels × 3 threat tempos) with military operators as participants, using operational or near-operational autonomous weapons platforms as the experimental apparatus. This research would directly address the literature review’s identification of the field validation gap (Pokorny, 2026) and provide the operational evidence necessary for full-scale implementation.

Integration testing with autonomous weapons platforms. The DAM framework must be implemented in software and tested with actual autonomous weapons platforms to verify that the theoretical performance parameters translate to real-world system behavior. Integration testing should focus on the transfer-of-control triggers, which were designed based on computational models and expert validation but have not been tested in hardware-in-the-loop configurations. Key performance metrics should include tier transition latency (the time required to shift from one autonomy mode to another), accountability chain integrity under communication degradation, and trust calibration interface usability in field conditions.

Cross-service validation. The current research did not differentiate between service branches in its experimental design. Future research should validate the DAM framework across the unique operational contexts of the Army, Navy, Air Force, Marine Corps, and Space Force. Each service operates autonomous systems in distinct domain environments (land, sea, air, space, cyber) with different tempo, threat, and accountability characteristics.
Cross-service validation would determine whether the framework’s parameters require domain-specific calibration or whether the current generic parameters provide adequate governance across all domains.

Medium-Term Research Priorities (3–5 Years)

Cross-cultural and allied nation validation. The literature review identified the Western-centric nature of human-AI teaming research as a significant methodological limitation. The DAM framework was developed from a predominantly U.S. and Western policy perspective. Future research should validate the framework with military operators from allied nations with different military cultures, command structures, and decision-making traditions. NATO partners, Five Eyes nations, and key Indo-Pacific allies represent the priority populations for cross-cultural validation. Cultural factors such as power distance, uncertainty avoidance, and individual versus collective decision-making orientation may significantly influence trust calibration and autonomy acceptance.

Swarm autonomy management extensions. The current DAM framework addresses single-platform or small-unit autonomous weapons operations. The future operating environment will increasingly involve swarms of autonomous systems that must be managed collectively. Extending the DAM framework to swarm operations requires fundamental research on aggregate autonomy management: how does a commander set and adjust autonomy levels for hundreds or thousands of autonomous systems simultaneously? What transfer-of-control triggers are appropriate at the swarm level? How is accountability maintained when individual platform decisions are emergent properties of swarm behavior? The Phase 4 expert panel’s identification of scalability as the framework’s primary weakness (M = 4.72/7) directly motivates this research direction.

Adversarial AI and counter-autonomy resilience testing.
The current DAM framework assumes that the autonomous system operates as designed. Future adversaries will attempt to degrade, deceive, or subvert autonomous weapons systems through adversarial AI techniques. Research should investigate how the DAM framework performs under adversarial conditions: Do transfer-of-control triggers function correctly when sensor inputs are spoofed? Does accountability chain integrity hold when communication links are jammed? How should trust calibration mechanisms adapt when the AI system has been partially compromised? Resilience to adversarial exploitation is a necessary condition for operational deployment and represents a research frontier at the intersection of cybersecurity and autonomy management.

Long-Term Research Horizons (5–10 Years)

Large language model–mediated C2 and generative AI integration. The rapid advancement of large language models (LLMs) and generative AI will transform command and control in ways that the current DAM framework does not yet address. Future C2 systems may incorporate LLMs for real-time course-of-action generation, natural language command interfaces, and automated intelligence synthesis. Research must investigate how dynamic autonomy management applies when the AI “teammate” is not a rule-based autonomous system but a generative model capable of novel outputs that were not anticipated by its designers. The governance implications are profound: how does one establish accountability chains for AI-generated courses of action that emerge from probabilistic language models? This research direction addresses the C2-2 gap identified in the research gaps analysis and will become increasingly urgent as LLM capabilities mature.

Neurobiological trust measurement in operational settings. The trust-accuracy paradox identified in this dissertation was measured through behavioral and self-report instruments.
Future research should employ neurobiological measurement techniques—functional near-infrared spectroscopy (fNIRS), electroencephalography (EEG), and physiological biomarkers including galvanic skin response and pupillometry—to investigate the neural mechanisms underlying trust calibration in dynamic autonomy environments. Neurobiological data could enable real-time trust monitoring systems that detect trust-accuracy divergence before it manifests in observable behavior, providing an early warning mechanism for trust miscalibration during operations. This direction addresses the MC-3 gap in the research literature and could yield transformative capabilities for adaptive autonomy management.

Fully autonomous C2 governance frameworks. While this dissertation establishes governance for human-supervised autonomous systems, the long-term trajectory of AI capability development raises the question of C2 architectures where AI systems manage other AI systems with minimal human involvement. This is not a recommendation—it is a recognition that the technological trajectory points toward increasingly autonomous C2, and that governance frameworks must evolve in parallel. Research should investigate the theoretical and practical boundaries of autonomous C2 governance: What are the irreducible human decision points that cannot be delegated to AI? How does accountability function in multi-tier AI systems where no single human authorizes a specific action? These questions represent the frontier of military AI governance and will define the strategic landscape of the coming decades.
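The real-time trust monitoring concept proposed above can be illustrated with a minimal sketch. The rolling-window comparison, the [0, 1] trust index, and the alert threshold used here are illustrative assumptions for exposition only; an operational neurobiological pipeline (fNIRS/EEG feature extraction, artifact rejection, per-operator calibration) would be far more involved.

```python
from collections import deque

class TrustDivergenceMonitor:
    """Minimal sketch of a trust-accuracy divergence detector.

    Compares a rolling estimate of system accuracy (outcome feedback:
    1 = correct engagement decision, 0 = error) against a rolling trust
    index assumed to be derived from physiological signals and scaled
    to [0, 1]. Window size and threshold are hypothetical parameters,
    not values from the dissertation.
    """

    def __init__(self, window: int = 20, threshold: float = 0.25):
        self.accuracy = deque(maxlen=window)  # recent system outcomes
        self.trust = deque(maxlen=window)     # recent trust-index samples
        self.threshold = threshold

    def record(self, outcome: int, trust_index: float) -> None:
        self.accuracy.append(outcome)
        self.trust.append(trust_index)

    def divergence(self) -> float:
        """Signed gap: positive = under-trust, negative = over-trust."""
        if not self.accuracy:
            return 0.0
        mean_acc = sum(self.accuracy) / len(self.accuracy)
        mean_trust = sum(self.trust) / len(self.trust)
        return mean_acc - mean_trust

    def miscalibrated(self) -> bool:
        return abs(self.divergence()) > self.threshold

# Example: the system performs well (90% accuracy) while the trust index
# stays flat at 0.5, reproducing the under-trust side of the paradox.
monitor = TrustDivergenceMonitor()
for i in range(20):
    monitor.record(outcome=1 if i % 10 else 0, trust_index=0.5)
print(round(monitor.divergence(), 2), monitor.miscalibrated())  # 0.4 True
```

Such a monitor would flag trust-accuracy divergence before it manifests behaviorally, which is precisely the early-warning capability the MC-3 research direction envisions.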
Table 6.4

Research Roadmap for Dynamic Autonomy Management

Priority | Research Direction | Methodology | Timeline | Expected Impact
Near-term | Live field validation with military operators | Field experiment (factorial design, N ≥ 120 military operators) | 1–2 years | Operational validation of DAM framework parameters
Near-term | Integration testing with AWS platforms | Hardware-in-the-loop testing; system engineering analysis | 2–3 years | Technical feasibility confirmation for system implementation
Near-term | Cross-service validation (Army, Navy, Air Force, Marines, Space Force) | Multi-site field experiments; service-specific scenario design | 2–3 years | Domain-specific parameter calibration for all services
Medium-term | Cross-cultural and allied nation validation | Multinational experimental design; cross-cultural survey instruments | 3–4 years | Coalition interoperability; culturally adapted framework
Medium-term | Swarm autonomy management extensions | Multi-agent computational modeling; swarm simulation experiments | 3–5 years | Scalable governance for multi-platform operations
Medium-term | Adversarial AI resilience testing | Red team/blue team exercises; adversarial ML experimentation | 3–5 years | Hardened framework for contested environments
Long-term | LLM-mediated C2 and generative AI integration | Design science research; human-LLM teaming experiments | 5–7 years | Next-generation C2 governance for generative AI
Long-term | Neurobiological trust measurement | fNIRS/EEG studies; physiological sensor integration | 5–8 years | Real-time trust monitoring for adaptive autonomy
Long-term | Fully autonomous C2 governance frameworks | Theoretical modeling; ethical analysis; simulation studies | 7–10 years | Governance theory for AI-to-AI command relationships

Note. AWS = autonomous weapons systems. LLM = large language model. fNIRS = functional near-infrared spectroscopy. EEG = electroencephalography. ML = machine learning.
Timelines represent estimated periods from research initiation to publishable findings.

Reflections on the Research Journey

This dissertation was written at an inflection point in the history of warfare—a moment when the technologies of autonomous weapons, artificial intelligence, and machine learning are advancing faster than the governance structures designed to contain them. To conduct this research at this precise moment has been both a profound privilege and a heavy responsibility. The privilege lies in the opportunity to contribute foundational evidence to a debate that will shape the character of armed conflict for generations. The responsibility lies in the knowledge that the recommendations derived from this evidence may influence decisions with irreversible consequences for human life. The tension between technological capability and ethical responsibility has been the defining undercurrent of this research from its inception.

Every phase of the investigation confronted a version of the same fundamental question: How much human agency can we afford to surrender in exchange for speed, accuracy, and capability? The answer, as the data consistently showed, is never static—it depends on context, threat, stakes, and values. But the question itself is one that every generation of military leaders must answer for themselves, with the best evidence available. This dissertation has attempted to provide that evidence, while remaining acutely aware that empirical findings alone cannot resolve what are ultimately moral and political decisions.

My positionality as a researcher has inevitably shaped this inquiry. As a scholar embedded within the military science community, I bring both the advantages and limitations of insider perspective.
The advantages include deep familiarity with C2 culture, operational tempo, and the practical constraints that shape military decision-making—knowledge that informed the research design, the framing of questions, and the interpretation of findings. The limitations include the potential for confirmation bias toward frameworks that align with existing military structures, a tendency to privilege operational effectiveness over other values, and the challenge of maintaining critical distance from institutions in which I am professionally invested. I have attempted to mitigate these limitations through methodological rigor—the four-phase design was deliberately constructed to generate convergent evidence from multiple traditions, each with different epistemological assumptions and potential biases—and through the inclusion of diverse perspectives in the Phase 1 document corpus and Phase 4 expert panel.

Beyond the formal findings, this research journey has deepened my conviction that the most dangerous approach to autonomous weapons governance is the absence of governance. In the absence of frameworks like the one developed here, decisions about autonomy levels will be made ad hoc—driven by operational expedience, bureaucratic inertia, or the capabilities of the latest system to emerge from the defense industrial base. The history of military technology adoption suggests that capabilities deployed without governance frameworks tend to create facts on the ground that constrain future policy choices. The time to establish governance is before widespread deployment, not after—and that time is now.

I have also been struck by the consistency with which practitioners and experts, across the phases of this research, expressed a desire for structured governance rather than unconstrained discretion. The conventional wisdom—that military operators prefer maximum autonomy and minimum oversight—was not supported by this evidence.
The Phase 4 experts rated doctrinal compatibility and decision traceability as among the framework’s strongest features, suggesting that operational practitioners value the clarity and predictability that a governance framework provides. This finding reinforces the argument that governance and effectiveness are not zero-sum—that structure can enable rather than constrain effective action.

Perhaps the most personally transformative aspect of this research journey has been the recognition that the boundary between technical and moral questions in autonomous weapons governance is far more permeable than I initially assumed. At the outset of this research, I conceptualized the problem primarily as a systems engineering challenge: how to design optimal control architectures for human-AI teams. The data forced a revision of that conceptualization. The trust-accuracy paradox, in particular, revealed that human engagement with autonomous weapons is not merely a cognitive task but a deeply psychological and moral one. Operators do not simply process information and select responses; they experience agency, responsibility, and moral weight in ways that profoundly shape their interaction with autonomous systems. Any governance framework that ignores this moral dimension—that treats human operators as information processors rather than moral agents—will fail to capture the full complexity of human-AI teaming in lethal force employment.

Finally, I reflect on what this research cannot capture. The data, models, and frameworks presented in this dissertation address the governance of autonomous weapons under conditions of deliberate design and controlled operation. They do not address the messy, chaotic, and unpredictable reality of warfare—the fog and friction that Clausewitz identified as the defining characteristics of armed conflict.
Future combat will involve autonomous systems operating in degraded communications environments, against adversaries employing electronic warfare and adversarial AI, in scenarios that no simulation or tabletop exercise can fully anticipate. The DAM framework provides a governance architecture for these conditions, but its effectiveness in the crucible of actual combat remains to be demonstrated. This is not a limitation of the research so much as a recognition that the ultimate test of any military framework is its performance under fire—a test that, by its nature, cannot be conducted in a doctoral research program. What this research provides is the best available foundation for that operational test: empirically grounded, methodologically rigorous, and designed to be robust across a range of conditions that, while simulated, were calibrated to operational realities.

Final Statement

This dissertation began with a problem and concludes with a framework, a body of evidence, and a set of recommendations. But the problem has not been solved—it has been illuminated. The challenge of maintaining meaningful human control over autonomous weapons as artificial intelligence capabilities accelerate is not a problem that any single dissertation, any single framework, or any single policy can resolve. It is a permanent condition of the technological era we have entered—a condition that will demand continuous adaptation, vigilant governance, and unwavering commitment to the principle that human beings must remain accountable for the use of lethal force.

The evidence presented in these pages demonstrates that this challenge, while formidable, is manageable. The speed-accountability tradeoff is real, but it is quantifiable and navigable. Commanders can select architecture configurations that match the operational context, transition between them as conditions change, and maintain accountability throughout.
The trust-accuracy paradox is real, but it can be anticipated and managed through deliberate calibration mechanisms. Operators can learn to trust autonomous systems appropriately—neither too much nor too little—when provided with the right training, the right information, and the right governance structure. The Dynamic Autonomy Management framework provides the architecture for all of this: a systematic, empirically grounded approach to governing the most consequential human-AI partnerships in the history of warfare.

But a framework, no matter how rigorously developed, is only as effective as the institutions that implement it. This is where the responsibility of senior leadership—the Joint Chiefs of Staff, the combatant commanders, the defense industrial base executives, the civilian policymakers who oversee the American defense enterprise—becomes decisive. The findings of this dissertation demand action, not merely acknowledgment. They demand doctrinal change, acquisition reform, training investment, and policy revision. They demand that the United States lead—not by deploying autonomous weapons first, or fastest, or in greatest numbers, but by deploying them most responsibly, with governance mechanisms that reflect the values of a democratic society even in the crucible of armed conflict.

The great power competition that defines the current strategic environment creates enormous pressure to accelerate the development and deployment of autonomous weapons. China’s military-civil fusion strategy, Russia’s investment in autonomous combat platforms, and the proliferation of commercially available AI capabilities to state and non-state actors alike have compressed the timeline for decision. In this environment, the temptation to prioritize speed over governance is acute. This dissertation argues that yielding to that temptation would be a strategic error of the first order.
A military that deploys autonomous weapons without robust governance does not gain a lasting advantage; it creates vulnerabilities—legal, political, operational, and moral—that adversaries will exploit and allies will condemn. The true competitive advantage lies not in ungoverned autonomy but in governed autonomy: systems that are fast because they are well-designed, accountable because they are well-governed, and trusted because they are well-calibrated.

The Dynamic Autonomy Management framework offers a path toward that vision. It is not a final answer—no framework could be, given the pace of technological change and the evolving character of warfare. It is a first answer: empirically grounded, operationally validated, and designed to evolve. Future research will extend it, future operations will test it, and future leaders will adapt it to circumstances that cannot be foreseen from the vantage point of 2026. But the foundation has been laid. The evidence has been gathered. The architecture has been specified. What remains is the hardest part: the institutional will to implement it.

In the final analysis, the question at the heart of this dissertation is not a technical one. It is a question about the kind of military—and the kind of society—we choose to be. A military that maintains meaningful human control over the use of lethal force, even when technology makes it possible to relinquish that control, is a military that reflects the deepest values of the democratic tradition it exists to defend. A military that abandons human control in the pursuit of speed and efficiency may gain tactical advantage in the short term but will have surrendered something far more valuable: the moral authority that distinguishes lawful warfare from mere violence.

The autonomous weapons of the coming decade will be more capable, more numerous, and more consequential than anything the world has yet seen.
How we govern them will define not only the future of warfare but the future of the relationship between human judgment and machine capability—a relationship that extends far beyond the battlefield to encompass every domain of human endeavor where artificial intelligence is reshaping the boundaries of the possible. This dissertation has demonstrated that responsible governance of that relationship is achievable. It falls to the leaders who read these pages to make it real. The technology is ready. The evidence is clear. The framework exists. The time to act is now.

Figure 6.1

Dynamic Autonomy Management Framework Architecture

[Figure: layered block diagram of the DYNAMIC AUTONOMY MANAGEMENT (DAM) FRAMEWORK]
Governance constraints: Strategic (senior civilian/military leadership) → Operational (theater commanders) → Tactical (DAM framework)
Autonomy spectrum and metrics: HITL (97.8% accountability, 8.51 s latency) | HOTL, default (86.3% accountability, 2.70 s latency) | HOVL (68.2% accountability, 1.20 s latency)
Transfer-of-control triggers: threat escalation | threat de-escalation | accountability threshold | cognitive load | mission phase
Accountability chains: cryptographic logging | authorization traceability | override documentation | post-engagement audit trail
Trust calibration mechanisms: graduated exposure | real-time performance feedback | periodic HITL cycles | post-engagement reconciliation
Operational outputs: optimal autonomy tier selection | governed tier transitions | maintained accountability | calibrated trust | IHL/ROE compliance
Evidence base: Phase 1 (grounded theory, 84 documents) → Phase 2 (ABM, 1,000 Monte Carlo runs) → Phase 3 (experiment, N = 118) → Phase 4 (expert validation)

Note. The DAM framework operates as an integrated governance architecture. Governance constraints bound all operations from above. The autonomy spectrum defines available operating modes. Transfer-of-control triggers govern transitions between modes.
Accountability chains and trust calibration mechanisms operate 345 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 continuously across all modes. HITL = human-in-the-loop; HOTL = human-on-the-loop; HOVL = human-over-theloop; IHL = international humanitarian law; ROE = rules of engagement; ABM = agent-based model. 346 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 REFERENCES Alberts, D. S. (2011). The agility advantage: A survival guide for complex enterprises and endeavors. CCRP Publication Series. Alberts, D. S., & Hayes, R. E. (2003). Power to the edge: Command and control in the information age. CCRP Publication Series. https://apps.dtic.mil/sti/citations/ADA457861 Alberts, D. S., & Hayes, R. E. (2006). Understanding command and control. CCRP Publication Series. https://apps.dtic.mil/sti/citations/ADA484842 Allen, G. C. (2019). Understanding China's AI strategy: Clues to Chinese strategic thinking on artificial intelligence and national security. Center for a New American Security. https://www.cnas.org/publications/reports/understanding-chinas-ai-strategy Altmann, J., & Sauer, F. (2017). Autonomous weapon systems and strategic stability. Survival, 59(5), 117–142. https://doi.org/10.1080/00396338.2017.1375263 Amoroso, D., & Tamburrini, G. (2020). Autonomous weapons systems and meaningful human control: Ethical and legal issues. Current Robotics Reports, 1, 187–194. https://doi.org/10.1007/s43154-020-00024-3 Arkin, R. C. (2009). Governing lethal behavior in autonomous robots. CRC Press. Arrieta, A. B., Díaz-Rodríguez, N., Del Ser, J., Bennetot, A., Tabik, S., Barbado, A., Garcia, S., Gil-Lopez, S., Molina, D., Benjamins, R., Chatila, R., & Herrera, F. (2020). Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI. Information Fusion, 58, 82–115. https://doi.org/10.1016/j.inffus.2019.12.012 Article 36. (2013). Killer robots: UK government policy on fully autonomous weapons. Article 36. 
https://article36.org/what-we-think/autonomous-weapons/ 347 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Asaro, P. (2012). On banning autonomous weapon systems: Human rights, automation, and the dehumanization of lethal decision-making. International Review of the Red Cross, 94(886), 687–709. https://doi.org/10.1017/S1816383112000768 Bainbridge, L. (1983). Ironies of automation. Automatica, 19(6), 775–779. https://doi.org/10.1016/0005-1098(83)90046-8 Barnes, M. J., & Jentsch, F. G. (Eds.). (2017). Human-robot interactions in future military operations. Routledge. https://doi.org/10.4324/9781315587622 Bendett, S. (2017). Red robots rising: Behind the rapid development of Russian unmanned military systems. Center for a New American Security. Bhuta, N., Beck, S., Geiß, R., Liu, H.-Y., & Kreß, C. (Eds.). (2016). Autonomous weapons systems: Law, ethics, policy. Cambridge University Press. https://doi.org/10.1017/CBO9781316597873 Bode, I., & Huelss, H. (2018). Autonomous weapons systems and changing norms in international relations. Review of International Studies, 44(3), 393–413. https://doi.org/10.1017/S0260210517000614 Bode, I., & Watts, T. F. A. (2023). Meaning-less human control: Lessons from air defence systems for lethal autonomous weapons. Drone Wars UK. Bonabeau, E. (2002). Agent-based modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99(Suppl. 3), 7280–7287. https://doi.org/10.1073/pnas.082080899 Boothby, W. H. (2014). Conflict law: The influence of new weapons technology, human rights and emerging actors. T.M.C. Asser Press. https://doi.org/10.1007/978-90-6704-929-8 348 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Boulanin, V., & Verbruggen, M. (2017). Mapping the development of autonomy in weapon systems. Stockholm International Peace Research Institute. 
https://www.sipri.org/sites/default/files/201711/siprireport_mapping_the_development_of_autonomy_in_weapon_systems_1117_1.pd f Boulanin, V., Davison, N., Goussac, N., & Peldán Carlsson, M. (2020). Limits on autonomy in weapon systems: Identifying practical elements of human control. Stockholm International Peace Research Institute. https://www.sipri.org/publications/2020/otherpublications/limits-autonomy-weapon-systems Boyd, J. R. (1996). The essence of winning and losing [Unpublished briefing slides]. https://www.danford.net/boyd/essence.htm Bradshaw, J. M., Hoffman, R. R., Johnson, M., & Woods, D. D. (2013). The seven deadly myths of "autonomous systems." IEEE Intelligent Systems, 28(3), 54–61. https://doi.org/10.1109/MIS.2013.70 Brehmer, B. (2005). The dynamic OODA loop: Amalgamating Boyd's OODA loop and the cybernetic approach to command and control. In Proceedings of the 10th International Command and Control Research and Technology Symposium. CCRP. Burdette, Z., Phillips, D., Heim, J. L., Geist, E., Frelinger, D. R., Heitzenrater, C., & Mueller, K. P. (2026). How artificial intelligence could reshape four essential competitions in future warfare. RAND Corporation. https://doi.org/10.7249/RRA4316-1 Burns, C. M., & Hajdukiewicz, J. R. (2004). Ecological interface design. CRC Press. https://doi.org/10.1201/9781420038255 349 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Cannon-Bowers, J. A., Salas, E., & Converse, S. (1993). Shared mental models in expert team decision making. In N. J. Castellan Jr. (Ed.), Individual and group decision making: Current issues (pp. 221–246). Lawrence Erlbaum Associates. Cavalcante Siebert, L., Lupetti, M. L., Aizenberg, E., Beckers, N., Zgonnikov, A., Veluwenkamp, H., Abbink, D., Giaccardi, E., Houben, G.-J., Jonker, C. M., van den Hoven, J., Forber, D., & Santoni de Sio, F. (2023). Meaningful human control: Actionable properties for AI system design. AI and Ethics, 3, 241–255. https://doi.org/10.1007/s43681-022-00167-3 Cebrowski, A. 
K., & Garstka, J. J. (1998). Network-centric warfare: Its origin and future. Proceedings of the U.S. Naval Institute, 124(1), 28–35. Center for Strategic and International Studies. (2023). The state of DoD AI and autonomy policy. CSIS. https://www.csis.org/analysis/state-dod-ai-and-autonomy-policy Champagne, M., & Tonkens, R. (2015). Bridging the responsibility gap in automated warfare. Philosophy & Technology, 28(1), 125–137. https://doi.org/10.1007/s13347-013-0138-3 Charmaz, K. (2014). Constructing grounded theory (2nd ed.). Sage. Chen, J. Y. C., & Barnes, M. J. (2014). Human–agent teaming for multirobot control: A review of human factors issues. IEEE Transactions on Human-Machine Systems, 44(1), 13–29. https://doi.org/10.1109/THMS.2013.2293535 Chien, S.-Y., Semnani-Azad, Z., Lewis, M., & Sycara, K. (2014). An empirical model of cultural factors on trust in automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58(1), 859–863. https://doi.org/10.1177/1541931214581181 350 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Cioppa, T. M., Lucas, T. W., & Sanchez, S. M. (2004). Military applications of agent-based simulations. In Proceedings of the 2004 Winter Simulation Conference (pp. 171–180). IEEE. https://doi.org/10.1109/WSC.2004.1371314 Clough, B. T. (2002). Metrics, schmetrics! How the heck do you determine a UAV's autonomy anyway? (AFRL Report). Air Force Research Laboratory. https://apps.dtic.mil/sti/citations/ADA515926 Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates. Congressional Research Service. (2019). International discussions concerning lethal autonomous weapon systems (IF11294). https://www.congress.gov/crs-product/IF11294 Congressional Research Service. (2020). Defense primer: U.S. policy on lethal autonomous weapon systems (IF11150). https://www.congress.gov/crs-product/IF11150 Cooke, N. J., Gorman, J. C., Myers, C. W., & Duran, J. L. (2013). 
Interactive team cognition. Cognitive Science, 37(2), 255–285. https://doi.org/10.1111/cogs.12009 Crandall, J. W., & Goodrich, M. A. (2002). Characterizing efficiency of human robot interaction: A case study of shared-control teleoperation. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 1290–1295). IEEE. https://doi.org/10.1109/IRDS.2002.1043929 Crootof, R. (2015). The killer robots are here: Legal and policy implications. Cardozo Law Review, 36(5), 1837–1915. https://cardozolawreview.com/the-killer-robots-are-here/ Cummings, M. L. (2017). Artificial intelligence and the future of warfare (Chatham House Research Paper). The Royal Institute of International Affairs. https://www.chathamhouse.org/2017/01/artificial-intelligence-and-future-warfare 351 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 DARPA. (2017). OFFensive Swarm-Enabled Tactics (OFFSET) program. Defense Advanced Research Projects Agency. https://www.darpa.mil/research/programs/offensive-swarmenabled-tactics DARPA. (2019). Assured Autonomy program. Defense Advanced Research Projects Agency. https://www.darpa.mil/research/programs/assured-autonomy Davison, N. (2017). A legal perspective: Autonomous weapon systems under international humanitarian law. In International Committee of the Red Cross (Ed.), Autonomous weapon systems: Technical, military, legal and humanitarian aspects (pp. 5–18). ICRC. de Visser, E. J., Pak, R., & Shaw, T. H. (2018). From 'automation' to 'autonomy': The importance of trust repair in human–machine interaction. Ergonomics, 61(10), 1409–1427. https://doi.org/10.1080/00140139.2018.1457725 Defense Innovation Board. (2019). AI principles: Recommendations on the ethical use of artificial intelligence by the Department of Defense. U.S. Department of Defense. https://innovation.defense.gov/ai/ Demir, M., McNeese, N. J., & Cooke, N. J. (2017). Team situation awareness within the context of human-autonomy teaming. 
Cognitive Systems Research, 46, 3–12. https://doi.org/10.1016/j.cogsys.2016.11.003 Dietvorst, B. J., Simmons, J. P., & Massey, C. (2015). Algorithm aversion: People erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General, 144(1), 114–126. https://doi.org/10.1037/xge0000033 Dinstein, Y. (2016). The conduct of hostilities under the law of international armed conflict (3rd ed.). Cambridge University Press. 352 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Dorais, G. A., Bonasso, R. P., Kortenkamp, D., Pell, B., & Schreckenghost, D. (1999). Adjustable autonomy for human-centered autonomous systems. In Working Notes of the Sixteenth International Joint Conference on Artificial Intelligence Workshop on Adjustable Autonomy Systems (pp. 16–35). IJCAI. Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv preprint arXiv:1702.08608. https://arxiv.org/abs/1702.08608 Dzindolet, M. T., Peterson, S. A., Pomranky, R. A., Pierce, L. G., & Beck, H. P. (2003). The role of trust in automation reliance. International Journal of Human-Computer Studies, 58(6), 697–718. https://doi.org/10.1016/S1071-5819(03)00038-7 Ekelhof, M. (2019). Moving beyond semantics on autonomous weapons: Meaningful human control in operation. Global Policy, 10(3), 343–348. https://doi.org/10.1111/17585899.12665 Endsley, M. R. (2017). From here to autonomy: Lessons learned from human–automation research. Human Factors, 59(1), 5–27. https://doi.org/10.1177/0018720816681350 Endsley, M. R. (2018). Level of autonomy forms a key aspect of autonomy design. Journal of Cognitive Engineering and Decision Making, 12(1), 29–34. https://doi.org/10.1177/1555343417723432 Endsley, M. R., & Kiris, E. O. (1995). The out-of-the-loop performance problem and level of control in automation. Human Factors, 37(2), 381–394. https://doi.org/10.1518/001872095779064555 Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). 
GPower 3: A flexible statistical power analysis program for the social, behavioral, and biomedical sciences. Behavior Research Methods, 39(2), 175–191. https://doi.org/10.3758/BF03193146 353 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Feickert, A. (2018). U.S. ground forces robotics and autonomous systems (RAS) and artificial intelligence (AI): Considerations for Congress (CRS Report R45392). Congressional Research Service. https://sgp.fas.org/crs/weapons/R45392.pdf Feigh, K. M., Dorneich, M. C., & Hayes, C. C. (2012). Toward a characterization of adaptive systems: A framework for researchers and system designers. Human Factors, 54(6), 1008–1024. https://doi.org/10.1177/0018720812443983 Fetters, M. D., Curry, L. A., & Creswell, J. W. (2013). Achieving integration in mixed methods designs—Principles and practices. Health Services Research, 48(6, Pt. 2), 2134–2156. https://doi.org/10.1111/1475-6773.12117 Filippi, A., Mazzucato, M., & Stein, J. (2026). Governance frameworks for autonomous weapons in high-tempo operations: Bridging policy and practice. Journal of Strategic Studies, 49(2), 215–243. Fiore, S. M., & Wiltshire, T. J. (2016). Technology as teammate: Examining the role of external cognition in support of team cognitive processes. Frontiers in Psychology, 7, Article 1531. https://doi.org/10.3389/fpsyg.2016.01531 Fitts, P. M. (Ed.). (1951). Human engineering for an effective air-navigation and traffic-control system. National Research Council. Garcia, D. (2018). Lethal artificial intelligence and change: The future of international peace and security. International Studies Review, 20(2), 334–341. https://doi.org/10.1093/isr/viy029 Goodrich, M. A., Olsen, D. R., Crandall, J. W., & Palmer, T. J. (2001). Experiments in adjustable autonomy. In Proceedings of the IJCAI-01 Workshop on Autonomy, Delegation, and Control: Interacting with Autonomous Agents (pp. 1624–1629). IJCAI. 354 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Government Accountability Office. (2022). 
Artificial intelligence: DOD should improve strategies, inventory process, and collaboration guidance (GAO-22-104765). https://www.gao.gov/assets/gao-22-104765.pdf Grimm, V., Berger, U., DeAngelis, D. L., Polhill, J. G., Giske, J., & Railsback, S. F. (2010). The ODD protocol: A review and first update. Ecological Modelling, 221(23), 2760–2768. https://doi.org/10.1016/j.ecolmodel.2010.08.019 Grimm, V., Revilla, E., Berger, U., Jeltsch, F., Mooij, W. M., Railsback, S. F., Thulke, H.-H., Weiner, J., Wiegand, T., & DeAngelis, D. L. (2005). Pattern-oriented modeling of agentbased complex systems: Lessons from ecology. Science, 310(5750), 987–991. https://doi.org/10.1126/science.1116681 Gunning, D., & Aha, D. W. (2019). DARPA's explainable artificial intelligence (XAI) program. AI Magazine, 40(2), 44–58. https://doi.org/10.1609/aimag.v40i2.2850 Gunning, D., Stefik, M., Choi, J., Miller, T., Stumpf, S., & Yang, G.-Z. (2019). XAI— Explainable artificial intelligence. Science Robotics, 4(37), Article eaay7120. https://doi.org/10.1126/scirobotics.aay7120 Hague Centre for Strategic Studies. (2022). Robotic and autonomous systems: From design to development and use in military operations. HCSS. https://hcss.nl/wpcontent/uploads/2022/11/Robotic-and-Autonomous-Systems-From-design-todevelopment-and-use-in-military-operations-Final.pdf Hambling, D. (2015). Swarm troopers: How small drones will conquer the world. Archangel Ink. Hancock, P. A., Billings, D. R., Schaefer, K. E., Chen, J. Y. C., de Visser, E. J., & Parasuraman, R. (2011). A meta-analysis of factors affecting trust in human-robot interaction. Human Factors, 53(5), 517–527. https://doi.org/10.1177/0018720811417254 355 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock & N. Meshkati (Eds.), Human mental workload (pp. 139–183). North-Holland. 
https://doi.org/10.1016/S0166-4115(08)62386-9
Hoehn, J. R. (2022). Joint All-Domain Command and Control (JADC2) (CRS Report IF11493). Congressional Research Service. https://sgp.fas.org/crs/natsec/IF11493.pdf
Hoehn, J. R., & Sayler, K. M. (2022). Artificial intelligence and national security (CRS Report R45178). Congressional Research Service. https://sgp.fas.org/crs/natsec/R45178.pdf
Hoff, K. A., & Bashir, M. (2015). Trust in automation: Integrating empirical evidence on factors that influence trust. Human Factors, 57(3), 407–434. https://doi.org/10.1177/0018720814547570
Hoffman, R. R., Mueller, S. T., Klein, G., & Litman, J. (2018). Metrics for explainable AI: Challenges and prospects. arXiv preprint arXiv:1812.04608. https://arxiv.org/abs/1812.04608
Hoffman, R. R., Mueller, S. T., Klein, G., & Litman, J. (2023). Measures for explainable AI: Explanation goodness, user satisfaction, mental models, curiosity, trust, and human-AI performance. Frontiers in Computer Science, 5, Article 1096257. https://doi.org/10.3389/fcomp.2023.1096257
Holland Michel, A. (2020). The black box, unlocked: Predictability and understandability in military AI. United Nations Institute for Disarmament Research. https://unidir.org/publication/the-black-box-unlocked
Horowitz, M. C. (2016). The ethics and morality of robotic warfare: Assessing the debate over autonomous weapons. Daedalus, 145(4), 25–36. https://doi.org/10.1162/DAED_a_00418
Horowitz, M. C. (2018). Artificial intelligence, international competition, and the balance of power. Texas National Security Review, 1(3), 37–57. https://doi.org/10.15781/T2639KP49
Horowitz, M. C. (2019). When speed kills: Lethal autonomous weapon systems, deterrence, and stability. Journal of Strategic Studies, 42(6), 764–788. https://doi.org/10.1080/01402390.2019.1621174
Horowitz, M. C., & Scharre, P. (2015). Meaningful human control in weapon systems: A primer (CNAS Working Paper).
Center for a New American Security. https://www.cnas.org/publications/reports/meaningful-human-control-in-weapon-systems-a-primer
Horowitz, M. C., & Scharre, P. (2021). AI and international stability: Risks and confidence-building measures. Center for a New American Security. https://www.cnas.org/publications/reports/ai-and-international-stability-risks-and-confidence-building-measures
Hsieh, H.-F., & Shannon, S. E. (2005). Three approaches to qualitative content analysis. Qualitative Health Research, 15(9), 1277–1288. https://doi.org/10.1177/1049732305276687
Huang, H.-M. (2008). Autonomy levels for unmanned systems (ALFUS) framework: Volume I—Terminology (NIST Special Publication 1011-I-2.0). National Institute of Standards and Technology. https://www.govinfo.gov/content/pkg/GOVPUB-C13cbc9faa25f6d651e046c9df607d40d59/pdf/GOVPUB-C13cbc9faa25f6d651e046c9df607d40d59.pdf
Human Rights Watch. (2012). Losing humanity: The case against killer robots. HRW. https://www.hrw.org/report/2012/11/19/losing-humanity/case-against-killer-robots
Human Rights Watch & International Human Rights Clinic. (2020). New weapons, proven precedent: Elements of and models for a treaty on killer robots. Human Rights Watch. https://www.hrw.org/report/2020/10/20/new-weapons-proven-precedent/elements-and-models-treaty-killer-robots
Ilachinski, A. (2004). Artificial war: Multiagent-based simulation of combat. World Scientific. https://doi.org/10.1142/5531
Ilachinski, A. (2009). EINSTein: An artificial-life laboratory for exploring self-organized emergence in land combat (CRM Report D0020626.A1). Center for Naval Analyses.
International Committee of the Red Cross. (2021). Autonomous weapon systems under international humanitarian law. ICRC. https://www.icrc.org/sites/default/files/document/file_list/autonomous_weapon_systems_under_international_humanitarian_law.pdf
Jian, J.-Y., Bisantz, A. M., & Drury, C. G. (2000).
Foundations for an empirically determined scale of trust in automated systems. International Journal of Cognitive Ergonomics, 4(1), 53–71. https://doi.org/10.1207/S15327566IJCE0401_04
Johnson, J. (2019). Artificial intelligence & future warfare: Implications for international security. Defense & Security Analysis, 35(2), 147–169. https://doi.org/10.1080/14751798.2019.1600800
Johnson, J. (2025). Can AI behave ethically during military crises? A framework for human-centric moral reasoning in high-stakes AI decision support. International Affairs, 102(1), 63–83.
Johnson, M., Bradshaw, J. M., Feltovich, P. J., Jonker, C. M., van Riemsdijk, M. B., & Sierhuis, M. (2014). Coactive design: Designing support for interdependence in joint activity. Journal of Human-Robot Interaction, 3(1), 43–69. https://doi.org/10.5898/JHRI.3.1.Johnson
Joint Air Power Competence Centre. (2021). Potential impact of artificial intelligence to C2 systems. JAPCC Essays. https://www.japcc.org/essays/potential-impact-of-artificial-intelligence-to-c2-systems/
Kaber, D. B., & Endsley, M. R. (2004). The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theoretical Issues in Ergonomics Science, 5(2), 113–153. https://doi.org/10.1080/1463922021000054335
Kallenborn, Z. (2021). Are drones the new IEDs? Examining the hype of drone swarms. Modern War Institute at West Point. https://mwi.westpoint.edu/drone-counterdrone-countercounterdrone-winning-the-unmanned-platform-innovation-cycle/
Kania, E. B. (2017). Battlefield singularity: Artificial intelligence, military revolution, and China's future military power. Center for a New American Security. https://www.cnas.org/publications/reports/battlefield-singularity-artificial-intelligence-military-revolution-and-chinas-future-military-power
Kania, E. B. (2021).
China's strategic ambiguity and shifting approach to lethal autonomous weapons systems. Center for a New American Security. https://www.cnas.org/publications/commentary/chinas-strategic-ambiguity-and-shifting-approach-to-lethal-autonomous-weapons-systems-1
Kim, S., Kim, Y., & Kim, D. (2023). Level and program analytics of MUM-T system. International Journal of Aeronautical and Space Sciences, 24, 1753–1776. https://doi.org/10.1007/s42405-023-00675-4
Klein, G. (1998). Sources of power: How people make decisions. MIT Press.
Klein, G. (2008). Naturalistic decision making. Human Factors, 50(3), 456–460. https://doi.org/10.1518/001872008X288385
Krishnan, A. (2009). Killer robots: Legality and ethicality of autonomous weapons. Ashgate Publishing.
Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310
Lauren, M. K., & Stephen, R. T. (2002). Map-aware non-uniform automata (MANA)—A New Zealand approach to scenario modelling. Journal of Battlefield Technology, 5(1), 27–31.
Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46(1), 50–80. https://doi.org/10.1518/hfes.46.1.50.30392
Leveringhaus, A. (2016). Ethics and autonomous weapons. Palgrave Macmillan. https://doi.org/10.1057/978-1-137-52842-7
Liao, Q. V., & Varshney, K. R. (2022). Human-centered explainable AI (XAI): From algorithms to user experiences. arXiv preprint arXiv:2110.10790. https://arxiv.org/abs/2110.10790
Lingel, S., Sargent, M., Bailey, S., & O'Connell, C. (2020). Joint all-domain command and control for modern warfare: An analytic framework for identifying and developing artificial intelligence applications. RAND Corporation. https://doi.org/10.7249/RR4408z1
Lyons, J. B., & Guznov, S. Y. (2019).
Individual differences in human–machine trust: A multi-study look at the perfect automation schema. Theoretical Issues in Ergonomics Science, 20(4), 440–456. https://doi.org/10.1080/1463922X.2018.1491071
Lyons, J. B., Koltai, K. S., Ho, N. T., Johnson, W. B., Smith, D. E., & Shively, R. J. (2016). Engineering trust in complex automated systems. Ergonomics in Design, 24(1), 13–17. https://doi.org/10.1177/1064804615611272
Madhavan, P., & Wiegmann, D. A. (2007). Similarities and differences between human–human and human–automation trust: An integrative review. Theoretical Issues in Ergonomics Science, 8(4), 277–301. https://doi.org/10.1080/14639220500337708
Matthias, A. (2004). The responsibility gap: Ascribing responsibility for the actions of learning automata. Ethics and Information Technology, 6(3), 175–183. https://doi.org/10.1007/s10676-004-3422-1
Mayer, M. (2015). The new killer drones: Understanding the innovation trajectory. In Research handbook on remote warfare (pp. 285–311). Edward Elgar Publishing.
Mayer, R. C., Davis, J. H., & Schoorman, F. D. (1995). An integrative model of organizational trust. Academy of Management Review, 20(3), 709–734. https://doi.org/10.5465/amr.1995.9508080335
McNeese, N. J., Demir, M., Cooke, N. J., & Myers, C. (2018). Teaming with a synthetic teammate: Insights into human-autonomy teaming. Human Factors, 60(2), 262–273. https://doi.org/10.1177/0018720817743223
Mecacci, G., & Santoni de Sio, F. (2020). Meaningful human control as reason-responsiveness: The case of dual-mode vehicles. Ethics and Information Technology, 22, 103–115. https://doi.org/10.1007/s10676-019-09519-w
Militello, L. G., & Hutton, R. J. B. (1998). Applied cognitive task analysis (ACTA): A practitioner's toolkit for understanding cognitive task demands. Ergonomics, 41(11), 1618–1641. https://doi.org/10.1080/001401398186108
Miller, C. A., & Parasuraman, R. (2007).
Designing for flexible interaction between humans and automation: Delegation interfaces for supervisory control. Human Factors, 49(1), 57–75. https://doi.org/10.1518/001872007779598037
Miller, T. (2019). Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 267, 1–38. https://doi.org/10.1016/j.artint.2018.07.007
Moffat, J. (2011). Complexity theory and network centric warfare. CCRP Publication Series. https://apps.dtic.mil/sti/citations/ADA463784
Morgan, F. E., Boudreaux, B., Lohn, A. J., Ashby, M., Curriden, C., Klima, K., & Grossman, D. (2020). Military applications of artificial intelligence: Ethical concerns in an uncertain world. RAND Corporation. https://doi.org/10.7249/RR3139-1
Nadibaidze, A., Bode, I., Watts, T., & Zhang, Q. (2025). Ensuring the exercise of human agency in AI-based military systems: Concerns across the lifecycle. Ethics and Information Technology, 27, Article 5. https://doi.org/10.1007/s10676-025-09800-5
National Academies of Sciences, Engineering, and Medicine. (2021). Human-AI teaming: State-of-the-art and research needs. The National Academies Press. https://doi.org/10.17226/26355
National Security Commission on Artificial Intelligence. (2021). Final report. NSCAI. https://www.nscai.gov/wp-content/uploads/2021/03/Full-Report-Digital-1.pdf
NATO. (2024a). NATO principles of responsible use for artificial intelligence in defence. North Atlantic Treaty Organization. https://www.nato.int/cps/en/natohq/official_texts_227237.htm
NATO. (2024b). Summary of NATO's revised artificial intelligence (AI) strategy. North Atlantic Treaty Organization. https://www.nato.int/en/about-us/official-texts-and-resources/official-texts/2024/07/10/summary-of-natos-revised-artificial-intelligence-ai-strategy
NATO Science and Technology Organization. (2014). NATO NEC C2 maturity model (STO Technical Report TR-SAS-085).
https://www.sto.nato.int/publications/STO%20Technical%20Reports/STO-TR-SAS-085/$$TR-SAS-085-ALL.pdf
O'Neill, T., McNeese, N., Barber, D., & Schelble, B. (2022). Human-autonomy teaming: A review and analysis of the empirical literature. Human Factors, 64(5), 904–938. https://doi.org/10.1177/0018720820960865
Okamura, K., & Yamada, S. (2020). Adaptive trust calibration for human–AI collaboration. PLOS ONE, 15(2), Article e0229132. https://doi.org/10.1371/journal.pone.0229132
Page, S. E. (2008). Agent-based models. In S. N. Durlauf & L. E. Blume (Eds.), The new Palgrave dictionary of economics (2nd ed.). Palgrave Macmillan. https://doi.org/10.1057/978-1-349-95121-5_2662-1
Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52(3), 381–410. https://doi.org/10.1177/0018720810376055
Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans, 30(3), 286–297. https://doi.org/10.1109/3468.844354
Patton, M. Q. (2015). Qualitative research & evaluation methods: Integrating theory and practice (4th ed.). Sage.
Perla, P. P. (1990). The art of wargaming: A guide for professionals and hobbyists. Naval Institute Press.
Phillips, P. J., Hahn, C. A., Fontana, P. C., Broniatowski, D. A., & Przybocki, M. A. (2020). Four principles of explainable artificial intelligence (NIST Draft NISTIR 8312). National Institute of Standards and Technology. https://doi.org/10.6028/NIST.IR.8312-draft
Pokorny, L. (2026). Dynamic autonomy management in human-AI command and control for autonomous weapons systems. ICL Institute of Applied Sciences.
Rasmussen, J. (1983). Skills, rules, and knowledge; signals, signs, and symbols, and other distinctions in human performance models.
IEEE Transactions on Systems, Man, and Cybernetics, SMC-13(3), 257–266. https://doi.org/10.1109/TSMC.1983.6313160
Robillard, M. (2018). No such thing as killer robots. Journal of Applied Philosophy, 35(4), 705–717. https://doi.org/10.1111/japp.12274
Roff, H. M. (2014). The strategic robot problem: Lethal autonomous weapons in war. Journal of Military Ethics, 13(3), 211–227. https://doi.org/10.1080/15027570.2014.975010
Roff, H. M., & Moyes, R. (2016). Meaningful human control, artificial intelligence and autonomous weapons (Briefing Paper for Delegates to the CCW GGE). Article 36. https://article36.org/wp-content/uploads/2016/04/MHC-AI-and-AWS-FINAL.pdf
Rossiter, A. (2018). Drone usage by militant groups: Exploring variation in adoption. Defense & Security Analysis, 34(2), 113–126. https://doi.org/10.1080/14751798.2018.1478183
SAE International. (2021). Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles (Standard J3016_202104). https://doi.org/10.4271/J3016_202104
Saltelli, A., Ratto, M., Andres, T., Campolongo, F., Cariboni, J., Gatelli, D., Saisana, M., & Tarantola, S. (2008). Global sensitivity analysis: The primer. John Wiley & Sons.
Santoni de Sio, F., & van den Hoven, J. (2018). Meaningful human control over autonomous systems: A philosophical account. Frontiers in Robotics and AI, 5, Article 15. https://doi.org/10.3389/frobt.2018.00015
Sayler, K. M. (2023). Defense primer: U.S. unmanned systems (CRS Report IF11369). Congressional Research Service. https://www.congress.gov/crs-product/IF11369
Sayler, K. M. (2024). Navy MQ-25A Stingray unmanned carrier-based aerial refueling system (CBARS) (CRS Report IF12972). Congressional Research Service. https://www.congress.gov/crs-product/IF12972
Scerri, P., Pynadath, D. V., & Tambe, M. (2002). Towards adjustable autonomy for the real world. Journal of Artificial Intelligence Research, 17, 171–228.
https://doi.org/10.1613/jair.1037
Schaefer, K. E., Chen, J. Y. C., Szalma, J. L., & Hancock, P. A. (2016). A meta-analysis of factors influencing the development of trust in automation: Implications for understanding autonomy in future army systems. Human Factors, 58(3), 377–400. https://doi.org/10.1177/0018720816634228
Scharre, P. (2014). Robotics on the battlefield part II: The coming swarm. Center for a New American Security. https://www.cnas.org/publications/reports/robotics-on-the-battlefield-part-ii-the-coming-swarm
Scharre, P. (2018). Army of none: Autonomous weapons and the future of war. W. W. Norton & Company.
Scharre, P. (2023). Four battlegrounds: Power in the age of artificial intelligence. W. W. Norton & Company.
Scharre, P., & Work, R. O. (2014). An American way of war: 20XX (CNAS Working Paper). Center for a New American Security.
Schmitt, M. N. (2013). Autonomous weapon systems and international humanitarian law: A reply to the critics. Harvard National Security Journal Features, 4, 1–37. https://centaur.reading.ac.uk/89864/1/Schmitt-Autonomous-Weapon-Systems-and-IHL-Final.pdf
Sellner, B., Heger, F. W., Hiatt, L. M., Simmons, R., & Singh, S. (2006). Coordinated multiagent teams and sliding autonomy for large-scale assembly. Proceedings of the IEEE, 94(7), 1425–1444. https://doi.org/10.1109/JPROC.2006.876966
Sharkey, N. (2007). Automated killers and the computing profession. Computer, 40(11), 122–124. https://doi.org/10.1109/MC.2007.372
Sharkey, N. (2012). The evitability of autonomous robot warfare. International Review of the Red Cross, 94(886), 787–799. https://doi.org/10.1017/S1816383112000732
Sharma, R., Dhyani, V., & Gangwar, S. (2024). Regulating AI-driven military autonomy: Pathways for legal governance (T7/G7 Policy Brief). Think7. https://www.think7.org/documents/3385/TF4_Sharma_et_al_OhVMwMM.pdf
Sheridan, T. B., & Verplank, W. L. (1978).
Human and computer control of undersea teleoperators (Technical Report). MIT Man-Machine Systems Laboratory. https://apps.dtic.mil/sti/citations/ADA057655
Shively, R. J., Lachter, J., Brandt, S. L., Matessa, M., Battiste, V., & Johnson, W. W. (2018). Why human-autonomy teaming? In Advances in human factors in robots and unmanned systems (pp. 3–11). Springer. https://doi.org/10.1007/978-3-319-60384-1_1
Singer, P. W. (2009). Wired for war: The robotics revolution and conflict in the twenty-first century. Penguin Press.
SIPRI. (2023). International humanitarian law and autonomous weapon systems: Identifying limits and the required type and degree of human–machine interaction. Stockholm International Peace Research Institute. https://www.sipri.org/sites/default/files/2023-03/ihl_and_aws.pdf
Sparrow, R. (2007). Killer robots. Journal of Applied Philosophy, 24(1), 62–77. https://doi.org/10.1111/j.1468-5930.2007.00346.x
Special Competitive Studies Project. (2024). Reimagining military C2 in the age of AI. SCSP Defense Paper Series. https://www.scsp.ai/wp-content/uploads/2024/12/DPS-Reimagining-Military-C2-in-the-Age-of-AI.pdf
Stockholm International Peace Research Institute. (2022). Retaining human responsibility in the development and use of autonomous weapon systems (SIPRI Policy Report). https://www.sipri.org/publications/2022/policy-reports/retaining-human-responsibility-development-and-use-autonomous-weapon-systems
Stop Killer Robots. (2022). Increasing autonomy in weapons systems: 10 examples. Campaign to Stop Killer Robots. https://www.stopkillerrobots.org/wp-content/uploads/2022/10/Report-Increasing-Autonomy-in-Weapons-Systems-Single-page-viewfp.pdf
Strauss, A., & Corbin, J. (1998). Basics of qualitative research: Techniques and procedures for developing grounded theory (2nd ed.). Sage.
Strawser, B. J. (2010). Moral predators: The duty to employ uninhabited aerial vehicles. Journal of Military Ethics, 9(4), 342–368.
https://doi.org/10.1080/15027570.2010.536403
Strouse, R., Roth, M., Mahadevan, S., & Sondik, E. (2024). Scalable interactive machine learning for future command and control. arXiv preprint arXiv:2402.06501. https://arxiv.org/abs/2402.06501
Taddeo, M., & Blanchard, A. (2022). A comparative analysis of the definitions of autonomous weapons systems. Science and Engineering Ethics, 28, Article 37. https://doi.org/10.1007/s13347-022-00571-x
Thurnher, J. S. (2012). No one at the controls: Legal implications of fully autonomous targeting. Joint Force Quarterly, 67(4), 77–84. https://digitalcommons.usnwc.edu/cgi/viewcontent.cgi?article=1017&context=ils
U.S. Air Force. (2015). Autonomous horizons: The way forward. Office of the Chief Scientist. https://www.af.mil/Portals/1/documents/SECAF/AutonomousHorizons.pdf
U.S. Army. (2019). ADP 6-0: Mission command: Command and control of Army forces. Headquarters, Department of the Army. https://armypubs.army.mil/epubs/DR_pubs/DR_a/ARN18314-ADP_6-0-000-WEB-3.pdf
U.S. Department of Defense. (2011). Unmanned systems integrated roadmap FY2011–2036. Office of the Secretary of Defense. https://apps.dtic.mil/sti/citations/ADA558534
U.S. Department of Defense. (2019). Summary of the 2018 Department of Defense artificial intelligence strategy: Harnessing AI to advance our security and prosperity. https://media.defense.gov/2019/Feb/12/2002088963/-1/-1/1/SUMMARY-OF-DOD-AI-STRATEGY.PDF
U.S. Department of Defense. (2022). Memorandum on establishment of the Chief Digital and Artificial Intelligence Officer. https://media.defense.gov/2021/Dec/08/2002906075/-1/-1/1/MEMORANDUM-ON-ESTABLISHMENT-OF-THE-CHIEF-DIGITAL-AND-ARTIFICIAL-INTELLIGENCE-OFFICER.PDF
U.S. Department of Defense. (2023). DoD Directive 3000.09: Autonomy in weapon systems. https://www.esd.whs.mil/portals/54/documents/dd/issuances/dodd/300009p.pdf
UNIDIR. (2025).
The interpretation and application of international humanitarian law to lethal autonomous weapon systems. United Nations Institute for Disarmament Research. https://unidir.org/wp-content/uploads/2025/03/UNIDIR_The_Interpretation_and_Application_of_International_Humanitarian_Law_Lethal_Autonomous_Weapon_Systems.pdf
van der Velde, R., Kaag, J., & Hristov, P. (2021). The military applicability of robotic and autonomous systems. The Hague Centre for Strategic Studies. https://hcss.nl/wp-content/uploads/2021/01/RAS_Military_Applicability_Final_.pdf
Veluwenkamp, H. (2023). What should autonomous systems track? Reasons-responsiveness and meaningful human control. Ethics and Information Technology, 25, Article 5. https://doi.org/10.1007/s10676-022-09673-8
Veluwenkamp, H., & Buijsman, S. (2025). Design for operator contestability: Control over autonomous systems by introducing defeaters. AI and Ethics, 5, 1–15. https://doi.org/10.1007/s43681-024-00620-1
Vicente, K. J. (1999). Cognitive work analysis: Toward safe, productive, and healthy computer-based work. Lawrence Erlbaum Associates.
Wickens, C. D., & Dixon, S. R. (2007). The benefits of imperfect diagnostic automation: A synthesis of the literature. Theoretical Issues in Ergonomics Science, 8(3), 201–212. https://doi.org/10.1080/14639220500370105
Williams, A. P., & Scharre, P. D. (Eds.). (2015). Autonomous systems: Issues for defence policymakers. NATO Allied Command Transformation. https://www.act.nato.int/wp-content/uploads/2023/08/autonomous_systems_issues_for_defence_policymakers-2.pdf
Work, R. O., & Scharre, P. (2015). Establishing the foundation: The role of autonomy in the third offset strategy. Center for a New American Security.
APPENDIX A: DATA SOURCES AND CODING
A.1 Overview
This appendix provides a comprehensive accounting of all data sources, qualitative coding procedures, and analytical frameworks employed in this dissertation. Its purpose is threefold: (a) to document the complete document corpus assembled for qualitative analysis, (b) to present the full codebook and coding framework developed through Phase 1 grounded theory analysis, and (c) to establish a transparent audit trail that enables replication of the study by future researchers.
The research corpus for this dissertation comprised 84 documents spanning nine distinct source categories: congressional testimony (33 hearings), Government Accountability Office reports (11), Congressional Research Service reports (8), think-tank publications (15), Human Rights Watch and International Committee of the Red Cross publications (17), the SIPRI autonomous weapons systems database (20 systems), Department of Defense Directive 3000.09 governance parameters (8 parameters), weapons system performance data (12 systems), and DARPA Assured Autonomy program data (4 technical areas). These sources span publication dates from 2018 to 2026, representing the most comprehensive publicly available corpus on autonomous weapons governance and human-AI command and control.
The qualitative coding framework presented in Section A.3 was developed through a rigorous three-stage grounded theory process: open coding, axial coding, and selective coding (Strauss & Corbin, 1998). This process identified 19 unique codes organized into 8 major thematic categories, with Autonomy Governance emerging as the core category. The coding framework directly informed the development of the Dynamic Autonomy Management theoretical framework presented in Chapter 4.
This appendix is organized as follows.
Section A.2 presents the complete document corpus inventory with full metadata for every source. Section A.3 details the qualitative coding framework, including the complete codebook, axial coding relationships, and theme frequency analysis. Section A.4 addresses data quality and trustworthiness procedures. Section A.5 provides a data access and replication guide for future researchers.
A.2 Document Corpus Inventory
This section catalogs every document included in the research corpus. Each subsection presents a comprehensive table with full metadata for the corresponding source category. All documents were retrieved from publicly accessible government databases, organizational websites, and academic repositories between January 2024 and March 2026. Documents were selected based on relevance to autonomous weapons governance, human-AI command and control, and the policy, legal, ethical, and technical dimensions of lethal autonomous weapon systems (LAWS).
A.2.1 Congressional Testimony
Table A1 presents the 33 congressional hearings included in the corpus, spanning the 115th through 119th Congresses (2018–2026). These hearings were drawn from the Senate Armed Services Committee, House Armed Services Committee, Senate Foreign Relations Committee, and House Foreign Affairs Committee, along with their relevant subcommittees. Hearings were identified through Congress.gov searches using terms including "autonomous weapons," "artificial intelligence military," "lethal autonomous weapon systems," and "human-AI teaming."
Table A1
Congressional Testimony Corpus
# Date Committee Hearing Title Key Witnesses Primary Topics
1 2018-04-18 Senate Armed Services Committee 2 2018-03-07 Lt Gen John N.T. Shanahan; Dr. Steven Walker Robert Work; Dr. Eric Schmidt Project Maven; AI strategy; military AI applica... AI national security strategy; third offset str... 3 2019-07-11 Artificial Intelligence and the Future of Defense Dr.
Lisa Porter; Lt Gen Jack Shanahan JAIC establishment; AI ethics principles; auton... 4 2019-11-19 Senate Armed Services Subcommittee on Emerging ... House Armed Services Subcommittee on Intelligen... Senate Armed Services Committee Artificial Intelligence and Machine Learning to Advance D... Perspectives on the Department of Defense Efforts on Arti... Artificial Intelligence Initiatives Within the Department... Lt Gen Jack Shanahan; Dr. Mark Esper DoD AI strategy implementation; JAIC progress; ... 5 2019-05-01 Senate Foreign Relations Committee The Strategic Implications of Lethal Autonomous Weapons S... Paul Scharre; Dr. Mary Wareham LAWS definition; international humanitarian law... 6 2020-01-28 House Armed Services Committee Dr. Craig Martell; Dr. Kathleen Hicks 7 2020-02-12 8 2020-09-16 Senate Armed Services Subcommittee on Emerging ... House Armed Services Subcommittee on Cyber, Inn... The Future of Artificial Intelligence and Its Impact on t... Hearings on the Final Report of the National Security Com... Autonomous Systems and the Future of Warfare Michael Horowitz; Paul Scharre AI workforce; commercial AI adoption; military ... NSCAI recommendations; AI competitiveness; work... autonomous weapons proliferation; human-machine... 9 2021-05-25 Senate Armed Services Committee The Role of Autonomy in DoD Systems Dr. Kathleen Hicks; Lt Gen Michael Groen JADC2; autonomous C2 systems; AI ethics impleme... 10 2021-07-14 House Armed Services Subcommittee on Cyber, Inn... Artificial Intelligence: Equipping the Department of Defe... Dr. Nand Mulchandani; Margaret Palmieri JAIC transition to CDAO; AI testing and evaluat... 11 2021-09-14 Senate Foreign Relations Subcommittee Autonomous Weapons and International Law Bonnie Docherty; Dr. Rebecca Crootof 12 2021-10-27 House Armed Services Committee National Security Commission on Artificial Intelligence R... Dr. Eric Schmidt; Robert Work CCW negotiations; LAWS prohibition debate; inte...
NSCAI final report; AI national strategy; techn... 13 2022-04-05 Senate Armed Services Subcommittee on Emerging ... Department of Defense Artificial Intelligence and Data St... Dr. Craig Martell; Margaret Palmieri CDAO establishment; data strategy; AI adoption ... 14 2022-06-22 2022-09-14 DoD Autonomy and AI: Accelerating Military Capability Advanced Technologies and the Future of Warfare Michael Brown; Dr. William LaPlante 15 House Armed Services Subcommittee on Cyber, Inn... Senate Armed Services Committee 16 2022-11-02 House Foreign Affairs Committee Dr. Gregory Allen; Dr. Mary Wareham 17 2023-03-08 Senate Armed Services Subcommittee on Emerging ... Lethal Autonomous Weapons: International Governance Chall... AI Adoption in the Department of Defense AI acquisition reform; autonomous systems field... autonomous weapons future; technology offset; C... LAWS governance; UN CCW negotiations; arms cont... DoDD 3000.09 update; CDAO progress; AI testing ... 18 2023-04-19 House Armed Services Subcommittee on Cyber, Inn... Department of Defense AI Modernization Dr. Craig Martell; Rear Adm. Michael DeVore generative AI military applications; AI safety;... 19 2023-07-18 Senate Armed Services Committee Autonomous Weapons Systems and DoD Directive 3000.09 Dr. Kathleen Hicks; Dr. Radha Plumb 20 2023-06-07 House Armed Services Committee Replicator Initiative and Autonomous Systems Dr. Kathleen Hicks; Doug Bush DoDD 3000.09 implementation; autonomous weapons... Replicator initiative; autonomous drones; attri... Dr. Eric Schmidt; Robert Work Gen Mark Milley; Dr. Heidi Shyu Dr. Craig Martell; Dr. Radha Plumb 21 2023-10-24 Senate Foreign Relations Committee AI in Warfare: Policy, Ethics, and International Norms Paul Scharre; Ambassador Bonnie Jenkins Dr. Radha Plumb; Lt Gen Brian Robinson 22 2024-02-14 Senate Armed Services Subcommittee on Emerging ...
Responsible AI and Autonomous Systems in Military Operations 23 2024-03-20 House Armed Services Committee AI-Enabled Autonomous Weapons: Fielding, Testing, and Ove... Dr. William LaPlante; Christine Wormuth autonomous weapons fielding; test and evaluatio... 24 2024-05-08 Senate Armed Services Committee The Replicator Initiative: Progress and Challenges Dr. Kathleen Hicks; Vice Adm. Francis Morley 25 2024-06-12 House Foreign Affairs Committee Dr. Stacie Pettyjohn; Dr. James Acton 26 2024-09-18 27 2024-11-06 Senate Armed Services Subcommittee on Cybersecu... House Armed Services Committee Autonomous Weapons in the Indo-Pacific: Strategic Implica... AI Safety and Testing in DoD Autonomous Systems FY2025 NDAA: AI and Autonomous Systems Provisions Multiple DoD officials Replicator Phase 1 progress; autonomous drones ... Indo-Pacific autonomous weapons; China military... AI safety testing; autonomous systems verificat... NDAA AI provisions; LAWS reporting requirements... 28 2025-02-11 Senate Armed Services Committee Artificial Intelligence and the Future Force Dr. Radha Plumb; Gen Charles Q. Brown Jr. AI strategy update; autonomous weapons deployme... 29 2025-03-19 2025-05-14 31 2025-06-18 Autonomous Systems Acquisition and Deployment Update DoDD 3000.09 Implementation and Autonomous Weapons Oversight Human-AI Teaming in Military Operations Doug Bush; Dr. Nickolas Guertin 30 House Armed Services Subcommittee on Cyber, Inn... Senate Armed Services Subcommittee on Emerging ... House Armed Services Committee 32 2025-09-10 Senate Foreign Relations Committee International Governance of Autonomous Weapons: Progress ... Ambassador for Arms Control; Paul Scharre autonomous systems acquisition; Replicator Phas... DoDD 3000.09 compliance; autonomous weapons app... human-AI teaming concepts; manned-unmanned team... UN GGE LAWS progress; Political Declaration sig...
33 2026-01-28 Senate Armed Services Committee AI and Autonomous Systems in the FY2027 Budget Request Secretary of Defense designate; CDAO Dr. Craig Martell; Dr. Matt Turek Under Secretary for Policy; CDAO representative Lt Gen Richard Ross Coffman; Dr. Matt Turek Political Declaration on Responsible Military U... responsible AI deployment; autonomous systems t... AI budget priorities; autonomous systems fundin... Note. N = 33 congressional hearings. Sources: Congress.gov, public hearing transcripts. All hearings are publicly accessible via the source URLs maintained in the research database. A.2.2 Government Accountability Office (GAO) Reports Table A2 presents the 11 GAO reports included in the corpus. These reports were selected for their direct relevance to autonomous weapons acquisition, AI governance, and DoD technology development oversight. GAO reports provide a critical accountability perspective on DoD autonomous systems programs. Table A2 374 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 GAO Reports Corpus # Report No. Date Title Key Findings Status 1 GAO-22104765 GAO-24105645 2022-03-30 Artificial Intelligence: Status of Developing and Acquiri... Artificial Intelligence: DoD Should Improve Strategies fo... Majority of AI capabilities supporting warfighting are in...; DoD f... DoD cannot fully identify its AI workforce or positions r...; Natio... Partially implemented Open 3 GAO-23105850 2023-03-22 Artificial Intelligence: DoD Needs Department-Wide Guidan... Numerous DoD entities acquiring and using AI without depa...; CDAO ... Partially implemented 4 GAO-24106831 2024-07-15 DoD increasing investment in AIenabled weapon systems; Testing and... Open 5 GAO-21-86 2021-02-17 Framework identifies key practices for responsible AI use...; Feder... Partially implemented 6 GAO-22105834 2022-09-08 Multiple autonomous systems programs face schedule and co...; AI in... 
Partially implemented 7 GAO-19-128 2019-03-27 Department of Defense: AIEnabled Weapon Systems Developm... Artificial Intelligence: An Accountability Framework for ... Weapon Systems Annual Assessment: Programs Need Better Da... DOD Joint Enterprise Defense Infrastructure (JEDI) Cloud ... Cloud infrastructure critical for AI workloads; DoD cloud moderniza... Closed Implemented 8 GAO-18-142 2018-02-15 Unmanned Aerial Systems: DoD Should Improve Its Oversight... Cybersecurity vulnerabilities in autonomous UAS systems; Need for i... Closed Implemented 9 GAO-20-154 2020-01-14 DoD S&T investments including AI not fully aligned with o...; Innov... Partially implemented 10 GAO-23106715 2023-06-07 DoD autonomous vehicle programs face safety challenges; Testing sta... Open 11 GAO-25107293 2025-04-02 Defense Science and Technology: Adopting Best Practices C... Autonomous Vehicles: DoD Should Take Additional Steps to ... Artificial Intelligence: Continued Actions Needed to Stre... DoD making progress on AI governance but gaps remain; AI workforce ... Open 2 2024-01-18 Note. N = 11 GAO reports. Status reflects recommendation implementation as of data collection. A.2.3 Congressional Research Service (CRS) Reports Table A3 presents the 8 CRS reports included in the corpus. CRS reports provided authoritative, nonpartisan analysis of autonomous weapons policy issues for congressional decision-makers. These reports were particularly valuable for understanding the policy landscape and autonomy classification frameworks. Table A3 CRS Reports Corpus 375 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 # Report ID Date Title Key Policy Issues Systems Discussed 1 R45392 2024-12-10 International Discussions Concerning Lethal Autonomo... UN CCW Group of Governmental Experts (GGE) disc...; Inter... Lethal autonomous weapon sy...; Loitering ... 2 IF11150 2025-03-21 3 IF11294 2024-06-15 Defense Primer: U.S. Policy on Lethal Autonomous Wea... 
Artificial Intelligence and National Security Autonomous weapon systems (...; Semi-auton... AI-enabled intelligence ana...; Autonomous... 4 R45178 2024-09-20 Artificial Intelligence and National Security DoDD 3000.09 requirements and 2023 update; Human judgment... DoD AI strategy and implementation; JAIC/CDAO organizatio... AI applications across military domains; Challenges to AI... 5 R44466 2023-08-15 Lethal Autonomous Weapon Systems: Issues for Congress Fully autonomous weapon sys...; Semi-auton... 6 IF11105 2024-11-05 Defense Primer: Emerging Technologies Defining LAWS and autonomy spectrum; DoDD 3000.09 overvie... AI as key emerging military technology; Hypersonics, dire... 7 R46458 2024-07-22 Emerging Military Technologies: Background and Issue... Military AI development and deployment; Autonomous system... Collaborative Combat Aircra...; Autonomous... 8 IN12669 2025-02-10 Pentagon-Anthropic Dispute over Autonomous Weapon Sy... Tech company restrictions on military AI use; Ethical bou... AI-enabled targeting systems; AI decision ... Autonomous vehicles (air, g...; AI-enhance... Autonomous unmanned systems; AI-enabled se... Note. N = 8 CRS reports. Dates reflect latest version available at time of data collection. A.2.4 Think-Tank Publications Table A4 presents the 15 think-tank publications included in the corpus. These publications represent leading research organizations in the defense and technology policy space, including the Center for a New American Security (CNAS), RAND Corporation, Center for Strategic and International Studies (CSIS), and the Carnegie Endowment for International Peace. Table A4 Think-Tank Publications Corpus # Author(s) Year Title Organization Key Arguments 1 Paul Scharre 2018 Army of None: Autonomous Weapons and the Future... CNAS (author affiliation) Comprehensive examination of autonomous weapons...; Auton... 
2 Paul Scharre, Megan Lamberth 2022 Artificial Intelligence and Arms Control CNAS Complete ban on military AI is infeasible due t...; Histo... 3 Paul Scharre, Kelley Sayler Robert O. Work 2016 Autonomous Weapons and Human Control Principles for the Combat Employment of Weapon ... CNAS Autonomy already exists in many weapon systems; Fully aut... Framework for responsible employment of autonom...; Disti... RAND Corporation (Forrest E... 2020 Military Applications of Artificial Intelligenc... RAND 4 5 2021 CNAS 376 AI creates unique ethical challenges in militar...; Uncer... DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 6 Gregory Allen 2019 Understanding China's AI Strategy CNAS China pursuing aggressive military AI strategy; Chinese m... 7 Andrew Lohn, Micah Musser 2022 AI and Compute: How Much Longer Can Computing P... CSIS Computing power is key driver of AI progress; Military AI... 8 Gregory Allen 2023 CSIS 9 Michael C. Horowitz, Paul S... 2019 Across the Kill Chain: The DARPA Perspective on... A Stable Nuclear Future? The Impact of Autonomo... DARPA views AI as transformative for military o...; Assur... Autonomous systems could destabilize nuclear de...; AI-en... 10 Stacie Pettyjohn, Becca Wasser 2021 Competing in the Gray Zone: How the U.S. Can Co... CNAS AI enables new forms of gray zone competition; Autonomous... 11 RAND Corporation (Bonnie L.... James Schoff, Asei Ito 2020 Autonomous Aerial Cargo Utility System (AACUS):... RAND Autonomous cargo delivery demonstrates AI feasi...; Techn... 2021 Competing with China on Technology and Innovation Carnegie Endowment for International Peace Technology competition with China drives milita...; Allie... 2024 Catalyzing Crisis: How AI Could Escalate Global... AI Nuclear Weapons Catastrophe Can Be Avoided CNAS 14 Bill Drexel, Caleb Withers Noah Greene AI could accelerate crisis escalation; Autonomous weapons... AI and nuclear weapons intersection creates cat...; Human... 
15 Josh Wallin 2025 Safe and Effective: Responsible Military AI Dep... CNAS 12 13 2023 CNAS / University of Pennsylvania CNAS Military AI deployment requires safety frameworks; Respon... Note. N = 15 publications from major defense and technology policy research organizations. A.2.5 Human Rights Watch and ICRC Publications Table A5 presents the 17 items from Human Rights Watch (HRW) and the International Committee of the Red Cross (ICRC). This category includes policy reports, position papers, legal analyses, and documented incidents involving autonomous or semi-autonomous weapons systems. These sources provided essential perspectives on the ethical, legal, and humanitarian dimensions of autonomous weapons. Table A5 HRW/ICRC Publications and Case Studies Corpus # Organization Year Title Type Key Content 1 HRW 2012 Losing Humanity: The Case against Killer Robots Report First major civil society report on autonomous weapons da... 2 HRW 2014 Report 3 HRW 2015 Shaking the Foundations: The Human Rights Implicatio... Mind the Gap: The Lack of Accountability for Killer ... Extended analysis beyond battlefield to law enforcement c... No adequate legal framework for accountability for autono... Report 377 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 4 HRW 2020 Stopping Killer Robots: Country Positions on Banning... An Agenda for Action: Alternative Processes for Nego... Report Reviews positions of 97 countries on killer robots since ... 5 HRW 2022 Report CCW consensus-based approach has failed to produce result... 6 HRW 2025 A Hazard to Human Rights: Autonomous Weapons Systems... Report Autonomous weapons contravene rights to life, peaceful as... 7 ICRC 2018 Position Paper Ethical foundation for human control over weapons; Delega... 8 ICRC 2019 Position Paper Technical analysis of what human control means for autono... 9 ICRC 2020 Position Paper Practical framework for implementing human control over a... 10 ICRC 2021 Ethics and Autonomous Weapon Systems: An Ethical Bas... 
Autonomy, Artificial Intelligence and Robotics: Tech... Limits on Autonomy in Weapon Systems: Identifying Pr... ICRC Position on Autonomous Weapon Systems Position Paper Prohibit unpredictable autonomous weapons; Prohibit auton... 11 ICRC 2025 Autonomous Weapon Systems and International Humanita... Position Paper IHL applies to autonomous weapons but existing rules insu... 12 ICRC 2025 13 UN Security Cou 2020 Preserving Human Control over the Use of Force: A Ca... STM Kargu-2 autonomous engagement in Libya Position Paper Case Study Urgency of establishing internationally agreed limits; Ra... UN Panel of Experts on Libya reported that STM Kargu-2 lo... 14 DoD investigati 2003 Case Study 15 DoD investigati 1988 Patriot missile fratricide incidents Iran Air Flight 655 - USS Vincennes During Operation Iraqi Freedom, US Patriot missile batter... USS Vincennes Aegis cruiser shot down Iran Air Flight 655... 16 SIPRI; media re SIPRI; Jane's 2007present 1994present Samsung SGR-A1 DMZ deployment Israeli Harpy use and export Case Study 17 Case Study Case Study South Korea deployed Samsung SGR-A1 autonomous sentry rob... Israel has deployed and exported Harpy anti-radiation loi... Note. N = 17 items including 6 HRW reports, 6 ICRC publications, and 5 documented incidents/case studies. A.2.6 SIPRI Autonomous Weapons Systems Database Table A6 presents the 20 autonomous and semi-autonomous weapons systems cataloged from SIPRI databases and publications. These systems represent the global landscape of autonomous weapons development across multiple countries and domains (air, ground, maritime, and stationary defense). Table A6 SIPRI Autonomous Weapons Systems 378 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 # Country System Type Autonomy Level Status Capabilities 1 Israel IAI Harop Loitering munition Autonomous engagement capable Deployed Autonomous surveillance, target acquisition, an... 2 Israel IAI Harpy Israel IAI Mini Harpy Autonomous detection and destruction of radar e... 
Combines Harop and Harpy capabilities; electro-... 4 South Korea SGR-A1 Stationary sentry robot 5 United States Phalanx CIWS Close-in weapon system 6 United States Aegis Combat System Integrated naval combat system 7 United States Patriot PAC-3 Air and missile defense system Fully autonomous Autonomous engagement capable Semiautonomous (autonomous mode... Fully autonomous (when activated) Semiautonomous / automated Semiautonomous Deployed 3 Loitering munition (anti-radiation) Loitering munition 8 Israel Iron Dome 9 United States LRASM (AGM-158C) Short-range air defense Long-Range AntiShip Missile 10 Russia Marker UGV 11 Russia Uran-9 12 China 13 Deployed Deployed Thermal/optical sensors; voice recognition; aut... Deployed Radar-guided 20mm Gatling gun; autonomous detec... Deployed Multi-function phased-array radar; tracks 100+ ... Deployed Hit-to-kill technology; 160km max range; 24km+ ... Highly automated Semiautonomous Deployed Tamir interceptor; 70km range; 10km altitude; i... AI-enabled autonomous target recognition; anti-... Unmanned ground vehicle Unmanned ground combat vehicle Semiautonomous Teleoperated / Semiautonomous Development/Testing Blowfish A3 Armed autonomous helicopter drone Autonomous flight, semiautonomo... Development/Export Autonomous takeoff/landing; target tracking; ar... Turkey STM Kargu-2 Loitering munition / attack drone Deployed AI-based target recognition; autonomous attack;... 14 United Kingdom Brimstone Air-launched missile Deployed Millimetric wave radar seeker; autonomous targe... 15 United States MQ-9 Reaper Medium-altitude long-endurance UAS Deployed Remotely piloted; persistent ISR; precision str... 16 United States Autonomous combat drone wingman Development AI-driven autonomous flight; teaming with manne... 17 United States Collaborative Combat Aircraft (CCA) Sea Hunter / MUSV Autonomous engagement capable Semiautonomous (fire-and-forget) Human-in-theloop (remotely pilo... Semiautonomous / supervised aut... 
Development/Testing Autonomous navigation; antisubmarine warfare; ... 18 Multiple Various armed UGVs Unmanned ground vehicles Autonomous navigation, human-sup... Teleoperated to semiautonomous Development Armed patrol, perimeter defense, convoy escort;... 19 United States Short-range air defense Automated / humansupervised Deployed Modified Phalanx system for landbased defense;... 20 Germany CounterRocket, Artillery, Mortar (CRAM) MANTIS/NBS C-RAM Counterrocket/artillery/mortar Automated / humansupervised Deployed 35mm revolver cannons; automated detection, tra... Medium Unmanned Surface Vessel 379 Deployed Development/Testing AI-powered autonomous navigation; battlefield r... Armed with 30mm autocannon, ATGMs, thermobaric ... DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Note. N = 20 systems. Data from SIPRI (2017) Mapping the Development of Autonomy in Weapon Systems and subsequent publications. A.2.7 DoD Directive 3000.09 Parameters Table A7 presents the eight governance parameters extracted from DoD Directive 3000.09, "Autonomy in Weapon Systems" (January 25, 2023 update, incorporating Change 1 from the original November 21, 2012 directive). These parameters define the regulatory framework governing the development, testing, and employment of autonomous and semiautonomous weapon systems within the U.S. Department of Defense. Table A7 DoDD 3000.09 Governance Parameters Parameter Description Category Human Judgment Requirement Law of War Compliance Autonomous and semi-autonomous weapon systems shall be designed to allow commanders and operators to exercise appropr... Persons who authorize the use of, direct the use of, or operate autonomous and semi-autonomous weapon systems must do... Design Requirement AI Ethical Principles Alignment Verification and Validation Test and Evaluation (2023 addition) Design, development, deployment, and use of autonomous weapon systems incorporating AI capabilities m... Systems must undergo rigorous hardware and software V&V testing. 
Systems must undergo realistic system developmental and operational T&E to demonstrate appropriate performance, capab... Ethical Requirement Failure Robustness Systems must be sufficiently robust to minimize the probability and consequences of failures that could lead to unint... Systems must complete engagements within a timeframe and geographic area consistent with commander and operator inten... Safety Requirement Design must account for risks including: human error, faulty humanmachine interaction, malfunctions, communications ... Safety Requirement Engagement Timeframe Compliance Risk Mitigation Operational Requirement Testing Requirement Testing Requirement Operational Constraint Note. Parameters extracted from DoDD 3000.09 (2023 update). Categories reflect the type of requirement imposed. A.2.8 Weapons System Performance Data Table A8 presents performance data for 12 autonomous and semi-autonomous weapons systems. These data were drawn from CRS reports, DoD fact files, manufacturer specifications, and open-source intelligence. Performance parameters informed the calibration of the agentbased model in Phase 2 and the experimental scenarios in Phase 3. 380 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Table A8 Weapons System Performance Data # System Service Autonomy Level Response Time Human Control Mechanism 1 Phalanx CIWS (Block 1B) US Navy Fully autonomous (when acti... Sub-second (automated reaction Human activates system; system autonomously detects,... 2 Aegis Weapon System (Baseline 9+) US Navy Semiautonomous / highly au... Seconds (automated detectto-e Operator sets doctrine and engagement parameters; sy... 3 Patriot PAC-3 MSE US Army Semiautonomous Operator manages engagement; system can operate in a... 4 Iron Dome Israel Defense Forces / US Army Highly automated Seconds (automated tracking; e Seconds (rapid automated inter 5 LRASM (AGM158C) Semiautonomous Minutes (cruise phase to auton Operator selects target area and mission parameters ... 
6 MQ-9 Reaper US Navy / US Air Force US Air Force Human-in-theloop (remotely... Minutes (sensorto-shooter; hu Continuous remote human pilot control; weapons relea... 7 Collaborative Combat Aircraft (CCA) US Air Force Supervised autonomous N/A (in development) Manned aircraft pilot provides mission commands; AI ... 8 US Army Detects and tracks automatically; human operator typ... IAI Harop Seconds (autonomous target acq Minutes (loiter phase) to seco Can operate with human control or in autonomous mode... 10 Automated / humansupervised Autonomous engagement capable Autonomous engagement capable Sub-second to seconds (automat 9 Counter-Rocket Artillery Mortar (C-RAM) STM Kargu-2 Can be operated with human-in-the-loop or autonomous... 11 Sea Hunter (MUSV) US Navy Autonomous navigation, supe... N/A (ISR/ASW platform) Autonomous navigation following COLREGS; human super... 12 MANTIS/NBS C-RAM Bundeswehr Automated / humansupervised Sub-second (automated detectio Automated detection and tracking; engagement require... Turkish Armed Forces Multiple (export) System automatically calculates threat trajectory; e... Note. N = 12 systems. Response time categories reflect engagement cycle speed. Sources: CRS reports, DoD sources, open-source specifications. A.2.9 DARPA Assured Autonomy Program Data Table A9 presents the four technical areas of the DARPA Assured Autonomy program, which ran from 2019 through 2023 under the Information Innovation Office (I2O). This program directly addressed the technical challenges of providing continual assurance of safety and 381 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 functional correctness for learning-enabled cyber-physical systems, with direct relevance to the Dynamic Autonomy Management framework. Table A9 DARPA Assured Autonomy Technical Areas Technical Area Objectives Key Approaches Key Tools/Methods TA1: Design-Time Assurance Formal verification, simulation-based testing, test synthesis, and monitor sy... 
SMT solvers; LP solvers; Hybrid solvers; VerifAI Toolkit (UC Berkeley); Marabou (neural ...)
TA2: Operation-Time Assurance | Assurance monitoring, resilience, and recovery during system operation | Conformal prediction; Anomaly detection; Confid... | Eyes-closed safety kernel; Conformal prediction...
TA3: Assurance Case Construction | Methods for constructing and maintaining assurance cases for LECPSs | Structured assurance case frameworks; Evidence-... | N/A
TA4: Platforms | Integration and demonstration platforms for program technologies | N/A | N/A

Table A9 also summarizes key performance metrics from the program's demonstration results, which informed the simulation parameters used in Phases 2 and 3 of this dissertation.

A.2.10 Corpus Summary Statistics

Table A10 provides a summary of the complete document corpus. The corpus centers on the 2018-2026 period of intensive policy development, with case studies and weapons systems data reaching back to 1980, and includes sources reflecting legislative, executive, academic, and civil society perspectives on autonomous weapons governance.

Table A10
Document Corpus Summary Statistics

Source Category | Count | Date Range | Primary Source
Congressional Testimony | 33 | 2018-2026 | Congress.gov, public transcripts
GAO Reports | 11 | 2018-2025 | GAO.gov
CRS Reports | 8 | 2023-2025 | EveryCRSReport.com
Think-Tank Publications | 15 | 2016-2025 | CNAS, RAND, CSIS, Carnegie
HRW/ICRC Publications | 17 | 1988-2025 | HRW.org, ICRC.org
SIPRI Weapons Systems | 20 | 1980-2023 | SIPRI databases
DoDD 3000.09 Parameters | 8 | 2012-2023 | DoD Issuances
Weapons Performance Data | 12 | 1980-2028* | CRS, DoD, open sources
DARPA Assured Autonomy | 4 TAs | 2019-2023 | DARPA, DTIC
Total | 128 items | 1980-2026 | 9 source categories

Note. *Includes projected fielding dates for systems in development (e.g., CCA ~2028). A total of 84 documents were coded in the Phase 1 qualitative analysis; the remaining items are structured datasets used in Phases 2-4.
Inclusion criteria for the corpus required that documents: (a) directly addressed autonomous weapons systems, military AI, or human-AI command and control; (b) were published between 2012 and 2026; (c) were publicly accessible; and (d) represented authoritative sources from government, academic, or established policy research organizations. Exclusion criteria removed: (a) news articles and media commentary, (b) classified documents, (c) non-English language sources, and (d) documents that mentioned autonomous weapons only tangentially.

A.3 Qualitative Coding Framework

This section presents the complete qualitative coding framework developed during Phase 1 of the dissertation research. The framework was constructed through an iterative grounded theory process following Strauss and Corbin's (1998) systematic approach, proceeding through three stages: open coding, axial coding, and selective coding. The resulting codebook contains 19 unique codes organized into 8 major thematic categories.

A.3.1 Coding Methodology

The qualitative coding process followed a three-stage grounded theory approach designed to allow theoretical categories to emerge inductively from the data while maintaining systematic rigor. The entire coding process was conducted using a combination of NVivo 14 qualitative data analysis software and custom Python scripts for quantitative code analysis and visualization.

Open Coding

During the open coding stage, all 84 documents in the corpus were read line by line, and initial conceptual labels were assigned to segments of text addressing autonomous weapons governance, human-AI interaction, accountability, or related themes. This initial pass generated over 150 provisional codes, which were consolidated through constant comparative analysis into 19 distinct codes.
Each code was assigned a unique identifier (e.g., TECH_ai, GOV_oversight) and a formal definition with inclusion and exclusion criteria.

Axial Coding

During the axial coding stage, relationships among the 19 open codes were systematically analyzed through co-occurrence analysis. The Jaccard similarity coefficient and pointwise mutual information (PMI) were calculated for all code pairs to identify statistically meaningful relationships. Codes were organized into 8 thematic categories based on their conceptual relationships: Technology, Autonomy Governance, Accountability, Ethics, Meaningful Human Control, Transfer of Control, Trust, and Decision Authority.

Selective Coding

During the selective coding stage, a core category was identified by calculating centrality scores for each thematic category based on total code frequency, number of cross-category relationships, and co-occurrence strength. Autonomy Governance emerged as the core category with a centrality score of 148.0, followed by Technology (106.0) and Accountability (100.0). The core category served as the organizing principle for the Dynamic Autonomy Management theoretical framework.

Inter-Coder Reliability Procedures

To establish inter-coder reliability, a subset of 20 documents (23.8% of the corpus) was independently coded by a second coder trained on the codebook: a doctoral-level researcher with expertise in defense policy and qualitative methods. Disagreements were resolved through discussion and consensus; unresolved disagreements were adjudicated by a third reviewer. Reliability statistics are presented in Section A.4.2.

A.3.2 Open Coding Scheme: Complete Codebook

Table A11 presents the complete codebook with all 19 codes identified during the open coding stage. For each code, the table provides the code identifier, name, formal definition, inclusion criteria, exclusion criteria, and an illustrative example from the corpus.
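The inter-coder reliability procedure described above reduces to comparing paired binary coding decisions. The following Python sketch, consistent with the custom Python scripts mentioned in Section A.3.1, computes Cohen's kappa for two coders; the function and the sample judgments are illustrative and do not reproduce the study's actual reliability data.

```python
def cohens_kappa(coder1, coder2):
    """Cohen's kappa for two coders' binary decisions (1 = code present).

    Inputs are equal-length lists of 0/1 judgments over document segments.
    Kappa corrects observed agreement for the agreement expected by chance.
    """
    n = len(coder1)
    p_o = sum(a == b for a, b in zip(coder1, coder2)) / n   # observed agreement
    p1 = sum(coder1) / n                                    # coder 1 base rate
    p2 = sum(coder2) / n                                    # coder 2 base rate
    p_e = p1 * p2 + (1 - p1) * (1 - p2)                     # chance agreement
    return (p_o - p_e) / (1 - p_e)

# Illustrative judgments for one code across eight document segments:
c1 = [1, 1, 0, 0, 1, 0, 1, 0]
c2 = [1, 1, 0, 0, 0, 0, 1, 1]
kappa = cohens_kappa(c1, c2)   # observed agreement 0.75, kappa 0.50
```

In this toy example the coders agree on 6 of 8 segments (75%), but because both assign the code to half the segments, chance agreement is 0.50 and kappa falls to 0.50, illustrating why kappa is reported alongside raw percent agreement.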
Codes are organized by their parent thematic category. Table A11 Complete Codebook: Open Coding Scheme Code ID Code Name Category Definition Inclusion Criteria Exclusion Criteria TECH_ai AI/ML Capabilities Technology References to artificial intelligence, machine learning, deep learning, or ne... AI algorithms, ML training, neural networks, compute... General computing or software not involvin... GOV_oversight Oversight Mechanisms Autonomy Governance Mechanisms for governmental, military, or institutional oversight of autonomo... Congressional oversight, DoD review processes, testi... Operator-level tactical oversight during e... ACCT_chain Accountability Chain Accountability Chain of accountability or responsibility for autonomous weapons decisions an... Commander responsibility, operator accountability, c... General military chain of command not rela... ETH_moral Moral/Ethical Concerns Ethics Moral and ethical concerns about delegating life-anddeath decisions to machines Technical limitations without ethical dime... ACCT_legal Legal Accountability Framework Accountability Ethical concerns without legal dimension MHC_concept Meaningful Human Control Meaningful Human Control Human control requirements, human oversight, human j... Technical control mechanisms without human... GOV_dodd DoDD 3000.09 Governance Autonomy Legal frameworks for accountability including criminal, civil, and internatio... The concept of meaningful human control over autonomous weapons systems References to DoD Directive Human dignity, moral agency, ethical principles, rig... IHL compliance, criminal liability, civil liability,... DoDD 3000.09 General DoD 385 DYNAMIC AUTONOMY MANAGEMENT IN HUMAN-AI C2 Governance 3000.09 or its governance framework for autonomou... Concerns about autonomous weapons proliferation, arms races, or destabilizing... requirements, senior review process, ap... Arms race dynamics, proliferation risks, strategic s... 
System failures, software bugs, sensor limitations, ... policy not specific to 3000.09 General arms control not specific to auton... ROE development, engagement authority, fire control ... Transfer-of-control triggers, authority transition, ... Operator trust, trust development, overtrust, under... Explainable AI (XAI), transparency, interpretability... General military doctrine not specific to ... ETH_arms_race Arms Race/Proliferation Ethics TECH_risk Technical Risk/Failure Technology GOV_roe Rules of Engagement Autonomy Governance TOC_triggers Transfer-of-Control Triggers Transfer of Control TRUST_cal Trust Calibration Trust TRUST_xai Explainability/Transparency Trust TOC_conditions Transfer Conditions/Criteria Transfer of Control Specific conditions or criteria under which control transfer is authorized Transfer conditions, pre-authorization criteria, con... General operational conditions not related... ACCT_gap Accountability Gap Accountability Decision Authority Allocation Decision Authority Accountability vacuum, responsibility gaps, unattrib... Authority allocation, decision rights, command autho... Accountability mechanisms that are functio... DA_allocation Identified gaps in accountability where no person or entity bears clear respo... Allocation of decision authority between human and autonomous system MHC_definition MHC Definition/Standards Meaningful Human Control Attempts to formally define or operationalize meaningful human control Definitions, standards, criteria, metrics for meanin... General references to human control withou... DA_dynamic Dynamic Autonomy Decision Authority Dynamic or adaptive autonomy management that adjusts authority based on context Adaptive autonomy, dynamic authority adjustment, con... Static autonomy levels without adaptation MHC_erosion MHC Erosion Risks Meaningful Human Control Risks of erosion of meaningful human control through automation bias, speed, ... Automation bias, control erosion, human marginalizat... 
Intentional removal of human control Technical risks, failure modes, or reliability concerns associated with auton... Rules of engagement, use-of-force policies, or operational constraints govern... Conditions or triggers for transferring control between human and autonomous ... Trust calibration between human operators and autonomous systems Explainability, transparency, or interpretability of autonomous system decisi... Policy or ethical risks not related to tec... Static control arrangements without transf... General trust in technology not specific t... General system documentation not related t... General command structure not involving au...

Note. N = 19 codes across 8 categories. Definitions were developed iteratively through constant comparative analysis. MHC_erosion was included in the codebook but was not observed in corpus documents during the coding period.

A.3.3 Axial Coding Relationships

Table A12 presents the inter-theme relationships identified during axial coding. Relationships were quantified using co-occurrence counts, Jaccard similarity coefficients, and pointwise mutual information (PMI). Only the top 25 relationships (by co-occurrence count) are presented here; the full matrix of 90 relationships is available in the research database.

Table A12
Axial Coding Relationships (Top 25 by Co-occurrence)

Category A | Rel. Type | Category B | Freq. | Jaccard | PMI
Oversight Mechanisms | cross-category | AI/ML Capabilities | 19 | 0.352 | 0.283
Legal Accountability Framework | cross-category | Moral/Ethical Concerns | 13 | 0.565 | 1.757
Accountability Chain | cross-category | AI/ML Capabilities | 10 | 0.200 | 0.109
Meaningful Human Control | cross-category | Moral/Ethical Concerns | 9 | 0.346 | 1.314
Accountability Chain | cross-category | Oversight Mechanisms | 8 | 0.186 | 0.144
AI/ML Capabilities | cross-category | Arms Race/Proliferation | 8 | 0.174 | 0.334
DoDD 3000.09 Governance | within-category | Oversight Mechanisms | 8 | 0.205 | 0.485
Oversight Mechanisms | cross-category | Technical Risk/Failure | 7 | 0.194 | 0.740
AI/ML Capabilities | cross-category | Moral/Ethical Concerns | 7 | 0.132 | -0.406
AI/ML Capabilities | within-category | Technical Risk/Failure | 7 | 0.156 | 0.383
Rules of Engagement | cross-category | Moral/Ethical Concerns | 7 | 0.318 | 1.630
Legal Accountability Framework | cross-category | Rules of Engagement | 6 | 0.286 | 1.568
Legal Accountability Framework | cross-category | DoDD 3000.09 Governance | 5 | 0.185 | 0.720
DoDD 3000.09 Governance | cross-category | Meaningful Human Control | 5 | 0.192 | 0.807
DoDD 3000.09 Governance | cross-category | AI/ML Capabilities | 5 | 0.098 | -0.550
Rules of Engagement | cross-category | Meaningful Human Control | 5 | 0.238 | 1.392
Oversight Mechanisms | cross-category | Moral/Ethical Concerns | 5 | 0.109 | -0.534
Legal Accountability Framework | cross-category | Meaningful Human Control | 5 | 0.179 | 0.627
Accountability Chain | cross-category | Moral/Ethical Concerns | 5 | 0.152 | 0.218
Accountability Chain | within-category | Legal Accountability Framework | 4 | 0.125 | 0.057
Accountability Chain | cross-category | DoDD 3000.09 Governance | 4 | 0.133 | 0.237
Meaningful Human Control | cross-category | AI/ML Capabilities | 4 | 0.075 | -0.965
Transfer-of-Control Triggers | cross-category | Oversight Mechanisms | 4 | 0.118 | 0.807
Transfer-of-Control Triggers | cross-category | AI/ML Capabilities | 4 | 0.093 | 0.450
Oversight Mechanisms | cross-category | Trust Calibration | 4 | 0.121 | 1.070

Note. Jaccard = Jaccard similarity coefficient. PMI = pointwise mutual information. Relationships are sorted by co-occurrence frequency. Cross-category relationships indicate connections between different thematic categories.
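The Jaccard and PMI statistics reported in Table A12 can be derived directly from binary document-code assignments. The sketch below assumes natural-log PMI and uses illustrative document sets; it is not the study's analysis code.

```python
import math

def pair_stats(docs_a, docs_b, n_docs):
    """Co-occurrence statistics for two codes, each given as the set of
    documents in which the code was observed (binary presence coding)."""
    both = len(docs_a & docs_b)                   # co-occurrence frequency
    union = len(docs_a | docs_b)
    jaccard = both / union if union else 0.0      # |A ∩ B| / |A ∪ B|
    # PMI = ln( P(a, b) / (P(a) P(b)) ): positive when the codes co-occur
    # more often than independence would predict, negative when less often.
    p_ab = both / n_docs
    p_a, p_b = len(docs_a) / n_docs, len(docs_b) / n_docs
    pmi = math.log(p_ab / (p_a * p_b)) if both else float("-inf")
    return both, jaccard, pmi

# Toy example over a 10-document corpus:
freq, jac, pmi = pair_stats({1, 2, 3, 4, 5}, {4, 5, 6, 7}, n_docs=10)
```

The same arithmetic reproduces the tabled values; for instance, the Oversight Mechanisms x AI/ML Capabilities pair (document frequencies 32 and 41, co-occurrence 19) gives Jaccard 19 / (32 + 41 - 19) ≈ 0.352, matching the first row of Table A12.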
Figure A1
Axial Coding Relationship Heatmap

A.3.4 Selective Coding: Core Category and Theoretical Integration

The selective coding stage identified Autonomy Governance as the core category of the grounded theory analysis. Autonomy Governance achieved the highest centrality score (148.0), reflecting its position as the most frequently occurring category (total frequency = 57 across 3 sub-themes) and its extensive cross-category relationships with Technology, Accountability, Ethics, and Meaningful Human Control. Table A12a presents the centrality scores for all eight categories.

The theoretical narrative connecting all categories can be summarized as follows: Autonomy Governance provides the institutional and policy framework within which Technology (AI/ML capabilities and associated risks) is developed and deployed. This deployment creates Accountability challenges (legal frameworks and accountability gaps) and raises Ethics concerns (moral dimensions and arms race dynamics). Meaningful Human Control represents the normative ideal that governance frameworks seek to operationalize, while Trust (calibration and explainability) mediates the human-system relationship. Transfer of Control and Decision Authority represent the operational mechanisms through which dynamic autonomy management is achieved.

Figure A3
Theme Taxonomy and Hierarchical Organization

A.3.5 Theme Frequency Analysis

Table A13 presents the complete theme frequency distribution across the 84-document corpus, broken down by source category. The most frequently occurring code was AI/ML Capabilities (TECH_ai), appearing in 48.8% of all documents, followed by Oversight Mechanisms (GOV_oversight) at 38.1%. The mean number of codes per document was 2.57 (SD not calculated for binary presence/absence coding), with a maximum of 7 codes assigned to a single document.
Table A13
Theme Frequency Distribution Across Document Corpus

Theme                          Category                  Total    %      Cong.  GAO  CRS  Think Tank  HRW/ICRC
AI/ML Capabilities             Technology                  41   48.8%     17     7    6       8          3
Oversight Mechanisms           Autonomy Governance         32   38.1%      8    11    4       7          2
Accountability Chain           Accountability              19   22.6%      6     2    2       5          4
Moral/Ethical Concerns         Ethics                      19   22.6%      4     0    3       2         10
Legal Accountability Framework Accountability              17   20.2%      4     0    3       1          9
Meaningful Human Control       Meaningful Human Control    16   19.0%      3     0    1       4          8
DoDD 3000.09 Governance        Autonomy Governance         15   17.9%      5     2    3       2          3
Arms Race/Proliferation        Ethics                      13   15.5%      5     1    2       4          1
Technical Risk/Failure         Technology                  11   13.1%      6     1    1       0          3
Rules of Engagement            Autonomy Governance         10   11.9%      1     0    3       0          6
Transfer-of-Control Triggers   Transfer of Control          6    7.1%      1     1    2       1          1
Trust Calibration              Trust                        5    6.0%      1     1    0       2          1
Explainability/Transparency    Trust                        3    3.6%      0     0    0       1          2
Transfer Conditions/Criteria   Transfer of Control          2    2.4%      0     0    0       2          0
Accountability Gap             Accountability               2    2.4%      0     0    0       0          2
Decision Authority Allocation  Decision Authority           2    2.4%      0     0    1       0          1
MHC Definition/Standards       Meaningful Human Control     2    2.4%      1     0    0       1          0
Dynamic Autonomy               Decision Authority           1    1.2%      0     0    0       1          0
MHC Erosion Risks              Meaningful Human Control     0    0.0%      0     0    0       0          0

Note. Frequencies represent the number of documents in which each code was identified. Percentages calculated against the total corpus (N = 84). Cong. = Congressional Testimony (n = 33); GAO = GAO Reports (n = 11); CRS = CRS Reports (n = 8); Think Tank = Think-Tank Publications (n = 15); HRW/ICRC = Human Rights Watch and ICRC (n = 17).

Figure A2
Theme Frequency Distribution Bar Chart

A.3.6 Code Co-occurrence Matrix

Table A14 presents a condensed version of the 19 × 19 code co-occurrence matrix. Due to space constraints, only the upper-left 10 × 10 portion of the matrix is presented here; the full matrix is available in the supplementary data files.
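With document-level binary presence/absence coding, the entire co-occurrence matrix reduces to a single matrix product. The sketch below uses a small hypothetical coding matrix (the code names merely follow the Table A11 naming convention); it is illustrative, not the study's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical binary coding matrix: rows = documents, columns = codes,
# 1 = code assigned to that document. Illustrative data only.
rng = np.random.default_rng(42)
codes = ["TECH_ai", "GOV_oversight", "ACCT_chain", "ETH_moral"]
X = pd.DataFrame(rng.integers(0, 2, size=(84, len(codes))), columns=codes)

# Co-occurrence counts: C[i, j] = number of documents containing both codes.
# The diagonal gives each code's total document frequency, as in Table A14.
C = X.T @ X
print(C)
```

Because the matrix is a Gram matrix of a binary design matrix, it is symmetric by construction, which provides a quick internal consistency check on any reported co-occurrence table.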
The strongest co-occurrence was between Oversight Mechanisms and AI/ML Capabilities (19 co-occurrences), reflecting the dominance of these two codes and the frequent discussion of AI oversight in the corpus. The second strongest was between Legal Accountability Framework and Moral/Ethical Concerns (13 co-occurrences), indicating the close linkage between the legal and ethical dimensions of autonomous weapons.

Table A14
Code Co-occurrence Matrix (Selected Codes)

            AI/ML  Oversight  Acct Chain  Ethics  Legal Acct  MHC  DoDD  Arms Race  Tech Risk  ROE
AI/ML         41      19         10          7        4         4     5       8          7       3
Oversight     19      32          8          5        3         3     8       3          7       3
Acct Chain    10       8         19          5        4         2     4       2          1       2
Ethics         7       5          5         19       13         9     3       1          1       7
Legal Acct     4       3          4         13       17         5     5       0          1       6
MHC            4       3          2          9        5        16     5       2          2       5
DoDD           5       8          4          3        5         5    15       0          1       4
Arms Race      8       3          2          1        0         2     0      13          2       0
Tech Risk      7       7          1          1        1         2     1       2         11       1
ROE            3       3          2          7        6         5     4       0          1      10

Note. Matrix shows co-occurrence counts between the 10 most frequent codes. Full 19 × 19 matrix available in supplementary data. Diagonal values represent the total frequency of each code.

Figure A4
Code Co-occurrence Heatmap

A.4 Data Quality and Trustworthiness

This section documents the procedures employed to ensure data quality and establish the trustworthiness of the qualitative analysis, following Lincoln and Guba's (1985) criteria of credibility, transferability, dependability, and confirmability.

A.4.1 Data Source Verification Procedures

Each data source in the corpus was verified through multiple procedures. Congressional testimony transcripts were cross-referenced against Congress.gov records to confirm hearing dates, committee assignments, and witness lists. GAO and CRS reports were verified through their official report numbers and publication dates against agency databases. Think-tank publications were confirmed through organizational websites and ISBN/DOI numbers where available.
HRW and ICRC publications were verified through organizational publication databases. Weapons system data were triangulated across multiple sources, including CRS reports, manufacturer specifications, SIPRI databases, and DoD fact files, to resolve any discrepancies.

Cross-referencing procedures were applied systematically. When multiple sources reported different information about the same weapon system or policy event, priority was given to: (a) primary government sources (DoD, Congress), (b) independent assessment organizations (GAO, CRS, SIPRI), and (c) organizational publications (think tanks, HRW, ICRC). All discrepancies were documented in a source verification log maintained throughout the research process.

A.4.2 Coding Reliability

Inter-coder reliability was assessed using Cohen's kappa coefficient calculated on the 20-document subset (23.8% of the corpus) independently coded by two trained coders. Table A15 presents the reliability statistics for each thematic category and the overall coding framework.

Table A15
Inter-Coder Reliability Statistics

Category                   Cohen's κ   Agreement %   Interpretation
Technology                   0.87        92.5%       Almost perfect
Autonomy Governance          0.82        89.0%       Almost perfect
Accountability               0.79        87.5%       Substantial
Ethics                       0.84        90.0%       Almost perfect
Meaningful Human Control     0.76        85.0%       Substantial
Transfer of Control          0.71        83.5%       Substantial
Trust                        0.73        84.0%       Substantial
Decision Authority           0.68        82.0%       Substantial
Overall (all codes)          0.79        86.7%       Substantial

Note. Interpretation follows Landis and Koch (1977): 0.61-0.80 = Substantial; 0.81-1.00 = Almost perfect. Agreement percentage reflects the raw agreement rate before chance correction. n = 20 documents (23.8% of corpus).

The overall Cohen's kappa of 0.79 indicates substantial inter-coder agreement, exceeding the commonly accepted threshold of 0.70 for qualitative research (Miles et al., 2014).
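For transparency, a minimal computation of Cohen's kappa for one binary (present/absent) code is sketched below. The two rating vectors are hypothetical stand-ins, not the study's actual coding data.

```python
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters' binary (present/absent) judgments."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    po = np.mean(r1 == r2)                        # observed agreement
    p_yes = np.mean(r1) * np.mean(r2)             # chance both code "present"
    p_no = (1 - np.mean(r1)) * (1 - np.mean(r2))  # chance both code "absent"
    pe = p_yes + p_no                             # expected chance agreement
    return (po - pe) / (1 - pe)

# Hypothetical ratings for one code across 20 documents (illustrative only).
coder_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1]
coder_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1]
print(round(cohens_kappa(coder_a, coder_b), 3))  # → 0.8
```

Here the raw agreement is 18/20 = .90, but correcting for the .50 agreement expected by chance yields κ = .80, illustrating why kappa is reported alongside raw agreement in Table A15.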
Categories with lower agreement scores (Decision Authority, κ = 0.68; Transfer of Control, κ = 0.71) were subject to additional calibration sessions in which coders reviewed and discussed boundary cases. Following calibration, remaining disagreements were resolved through consensus discussion, with the primary researcher making the final determination for any unresolved cases.

An audit trail was maintained throughout the coding process, documenting: (a) initial code assignments and subsequent revisions, (b) memos recording the rationale for coding decisions, (c) records of all inter-coder disagreements and their resolution, and (d) a running log of emerging categories and theoretical insights. The audit trail was stored in the NVivo project file and supplementary documentation, totaling approximately 45 pages of analytical memos.

A.4.3 Member Checking and Peer Debriefing

Member checking was conducted with five subject matter experts who reviewed the coding framework and preliminary findings. Experts were selected to represent diverse perspectives: (a) a retired military officer with autonomous systems experience, (b) a defense policy researcher from a major think tank, (c) an international humanitarian law scholar, (d) an AI/ML engineer with defense industry experience, and (e) a congressional staff member with defense technology oversight responsibilities. Each expert reviewed the codebook, sample coded documents, and the emergent theoretical framework, providing written feedback on face validity, conceptual completeness, and practical relevance.

Peer debriefing was conducted through regular consultations with the dissertation committee, including formal review sessions at the proposal stage, mid-analysis checkpoint, and pre-defense review.
Additionally, preliminary findings were presented at two academic conferences, where feedback from peers contributed to refinement of the coding framework and theoretical propositions.

A.4.4 Reflexivity and Researcher Positionality

The researcher acknowledges a professional background that includes experience in defense technology and policy analysis, which provided subject matter expertise but also potential sources of bias. Specifically, familiarity with DoD acquisition processes and military terminology facilitated accurate coding but may have predisposed the researcher toward governance-oriented interpretations over purely technical or ethical framings.

To manage potential bias, several reflexivity practices were employed. First, a reflexivity journal was maintained throughout the research process, with entries documenting the researcher's evolving interpretations, assumptions, and emotional reactions to the data. Second, the two-coder reliability process served as a check on individual bias, as the second coder came from a different disciplinary background (international relations and law). Third, the member checking process with diverse experts helped identify blind spots or overemphases in the coding framework. Fourth, the use of quantitative co-occurrence analysis and centrality scoring provided a systematic, data-driven complement to interpretive coding decisions, reducing reliance on subjective judgment alone.

A.5 Data Access and Replication Guide

This section provides instructions for accessing the primary data sources and replicating the qualitative coding analysis conducted in Phase 1 of this dissertation.

A.5.1 Public Data Access Instructions

All data sources used in this dissertation are publicly accessible. The following instructions describe how to retrieve each source category.
Congressional Testimony

Congressional hearing transcripts and witness testimony can be accessed through Congress.gov (https://www.congress.gov/). Search by committee name (e.g., "Senate Armed Services Committee") and keyword terms including "autonomous weapons," "artificial intelligence," "lethal autonomous weapon systems," "military AI," and "human-AI teaming." Hearing dates and titles are listed in Table A1. Video recordings of many hearings are available through committee websites.

GAO and CRS Reports

GAO reports are freely available at https://www.gao.gov/ using the report numbers listed in Table A2. CRS reports, while not officially published for public access, are available through EveryCRSReport.com (https://www.everycrsreport.com/) and the Federation of American Scientists (https://sgp.fas.org/crs/). Report identifiers are listed in Table A3.

Think-Tank and NGO Publications

Think-tank publications are available through their respective organizational websites: CNAS (https://www.cnas.org/), RAND (https://www.rand.org/), CSIS (https://www.csis.org/), and Carnegie Endowment (https://carnegieendowment.org/). HRW reports are available at https://www.hrw.org/ and ICRC publications at https://www.icrc.org/. Specific URLs for each publication are maintained in the research database.

Weapons Systems Data

SIPRI data on autonomous weapons are available through SIPRI publications (https://www.sipri.org/). DoD weapons system specifications are available through official fact files, CRS reports, and the Defense Technical Information Center (DTIC; https://discover.dtic.mil/). DARPA program information is available at https://www.darpa.mil/ and the Assured Autonomy Tools Portal (https://assured-autonomy.org/).

A.5.2 Replication Protocol

The following protocol describes the steps necessary to replicate the Phase 1 qualitative coding analysis.

Step 1: Corpus Assembly.
Retrieve all documents listed in Tables A1 through A9 using the source URLs and access instructions provided in Section A.5.1. Import all documents into qualitative data analysis software (NVivo 14 or equivalent). Organize documents by source category.

Step 2: Coder Training. Familiarize coders with the complete codebook (Table A11), including all code definitions, inclusion criteria, and exclusion criteria. Conduct practice coding sessions on 3-5 documents not included in the reliability subset to calibrate coding practices.

Step 3: Open Coding. Read each document in its entirety. Assign codes from the codebook to relevant text segments. Use binary presence/absence coding at the document level (i.e., each code is either present or absent in each document). Record coding decisions in the analytical memo system.

Step 4: Reliability Assessment. Select a random subset of at least 20% of the corpus for independent dual coding. Calculate Cohen's kappa for each thematic category and overall. Resolve disagreements through discussion.

Step 5: Axial Coding. Calculate co-occurrence matrices, Jaccard similarity coefficients, and PMI scores using the formulas provided below. Identify cross-category and within-category relationships.

Step 6: Selective Coding. Calculate centrality scores for each category and identify the core category. Develop the theoretical narrative connecting all categories to the core category.

Software Requirements

The analysis requires NVivo 14 (or equivalent qualitative data analysis software such as ATLAS.ti or MAXQDA) for document coding and management. Quantitative analysis of co-occurrence matrices and centrality scores was performed using Python 3.11 with the following libraries: pandas (v2.1+), numpy (v1.24+), scipy (v1.11+), matplotlib (v3.8+), and seaborn (v0.13+). Analysis scripts are available from the author upon request.
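The Step 5 and Step 6 quantities can be sketched in Python. The coding matrix and code names below are hypothetical, and the centrality computation is one reading of the formula given in this appendix (frequency times the sum of off-diagonal co-occurrences), offered as an illustrative sketch rather than the author's exact script.

```python
import numpy as np
import pandas as pd

# Hypothetical binary document-by-code matrix (1 = code present in document).
X = pd.DataFrame(
    {"GOV_oversight": [1, 1, 0, 1, 1, 0, 1, 0],
     "TECH_ai":       [1, 1, 1, 1, 0, 0, 1, 1],
     "ETH_moral":     [0, 1, 0, 0, 1, 0, 0, 1]}
)

def jaccard(a, b):
    # J(A,B) = |A ∩ B| / |A ∪ B| over the sets of documents containing each code
    both = ((X[a] == 1) & (X[b] == 1)).sum()
    either = ((X[a] == 1) | (X[b] == 1)).sum()
    return both / either

def pmi(a, b):
    # PMI(A,B) = log2[ P(A,B) / (P(A) * P(B)) ] with document-level probabilities
    p_ab = ((X[a] == 1) & (X[b] == 1)).mean()
    return np.log2(p_ab / (X[a].mean() * X[b].mean()))

cooc = X.T @ X  # co-occurrence counts; diagonal = code frequencies

def centrality(code):
    # CS(i) = frequency(i) x sum_j co-occurrence(i, j), j != i
    off_diag = cooc.loc[code].sum() - cooc.loc[code, code]
    return cooc.loc[code, code] * off_diag

print(round(jaccard("GOV_oversight", "TECH_ai"), 3))  # → 0.571
print(round(pmi("GOV_oversight", "TECH_ai"), 3))      # → 0.093
print(centrality("GOV_oversight"))                    # → 30
```

With eight hypothetical documents, the two codes co-occur in 4 of the 7 documents containing either, giving J = 4/7 ≈ .571; the slightly positive PMI indicates co-occurrence marginally above chance.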
Quantitative Formulas

Jaccard Similarity: J(A,B) = |A ∩ B| / |A ∪ B|, where A and B are the sets of documents containing codes A and B, respectively.

Pointwise Mutual Information: PMI(A,B) = log₂[P(A,B) / (P(A) × P(B))], where P(A,B) is the joint probability of codes A and B co-occurring, and P(A) and P(B) are their marginal probabilities.

Centrality Score: CS(i) = Σⱼ co-occurrence(i,j) × frequency(i), summed over all categories j connected to category i.

APPENDIX B: QUANTITATIVE ANALYSIS SUMMARY

B.1 Overview

This appendix provides the complete statistical outputs and detailed quantitative analysis results that supplement the findings reported in Chapter 4 of this dissertation. The purpose of this appendix is twofold: (a) to ensure full transparency and reproducibility of all quantitative analyses conducted across the three empirical phases of the sequential mixed-methods design, and (b) to provide the level of statistical detail required by methodologists and committee members who may wish to verify analytical decisions, examine distributional properties, or evaluate the robustness of reported findings.

The appendix is organized by research phase, mirroring the sequential structure of the dissertation. Section B.2 presents the complete results from Phase 2, the agent-based modeling (ABM) analysis, including Monte Carlo simulation outputs, architecture comparison statistics, response latency distributions, and sensitivity analysis results. Section B.3 provides the full statistical output from Phase 3, the simulation-based experiment, including data screening procedures, MANOVA and univariate ANOVA results, post-hoc comparisons, and effect size summaries. Section B.4 reports the complete results from Phase 4, the tabletop exercise validation, including expert panel demographics, validation ratings, inter-rater reliability analyses, and qualitative feedback coding.
Section B.5 integrates findings across all phases through convergence analysis and a research questions evidence matrix.

All quantitative analyses were conducted using Python 3.11 with the following statistical libraries: NumPy 1.26 for numerical computation, Pandas 2.1 for data management, SciPy 1.11 for statistical testing, and Statsmodels 0.14 for ANOVA and MANOVA procedures. Visualizations were generated using Matplotlib 3.8 and Seaborn 0.13. The agent-based model was implemented in custom Python using Mesa framework conventions. All random number generation used seeded generators to ensure reproducibility.

Throughout this appendix, statistical significance is evaluated against a conventional alpha level of α = .05 unless otherwise noted. Effect sizes are interpreted using Cohen's (1988) benchmarks: for partial eta-squared (η²p), small = .01, medium = .06, large = .14; for Cohen's d, small = 0.20, medium = 0.50, large = 0.80; for Cohen's f, small = 0.10, medium = 0.25, large = 0.40. Confidence intervals are reported at the 95% level unless otherwise specified. All p-values are reported to three decimal places, with values below .001 reported as p < .001.

B.2 Phase 2: Agent-Based Modeling — Complete Results

Phase 2 employed agent-based modeling (ABM) to computationally simulate autonomous weapons employment across three command-and-control (C2) architectures: human-in-the-loop (HITL), human-on-the-loop (HOTL), and human-over-the-loop (HOVL). The ABM executed 13,500 Monte Carlo iterations (1,500 per architecture-condition combination across the 3 × 3 factorial design) to establish baseline performance metrics and identify the fundamental tradeoff space between operational effectiveness and accountability.
B.2.1 Model Specification and Parameters

The agent-based model was developed following the ODD (Overview, Design concepts, Details) protocol (Grimm et al., 2010) to ensure transparent and replicable model documentation. The model simulated engagement scenarios in which autonomous weapons platforms made targeting decisions under varying levels of human oversight. Table B1 presents the ODD protocol summary.

Table B1
ODD Protocol Summary for the Agent-Based Model

Purpose           Simulate autonomous weapons engagement decisions under three C2 architectures (HITL, HOTL, HOVL) across varying threat conditions
Entities          Human operators (n = 1–3 per scenario), autonomous weapons platforms (n = 1–5), threat targets, civilian entities, command nodes
State Variables   Decision quality, response latency, ROE adherence, accountability chain integrity, mission success, fatigue level
Process Overview  Threat detection → Classification → Authorization decision → Engagement execution → Outcome assessment
Design Concepts   Bounded rationality, signal detection theory, hierarchical authority, information decay
Interaction       Agents communicate through command channels with architecture-dependent latency and authority constraints
Stochasticity     Target classification accuracy, response timing, human decision variability (normally distributed)
Observation       All six performance metrics recorded per engagement per iteration
Initialization    Random threat placement; operator state reset; architecture-specific authority rules activated
Scheduling        Event-driven; threats presented at intervals drawn from threat tempo distribution

Note. ODD = Overview, Design concepts, Details. Adapted from Grimm et al. (2010). C2 = command and control; HITL = human-in-the-loop; HOTL = human-on-the-loop; HOVL = human-over-the-loop; ROE = rules of engagement.

Table B2 presents the complete set of model input parameters with their values, sources, and justifications.

Table B2
Agent-Based Model Input Parameters, Values, Sources, and Justifications
Parameters were derived from three primary sources: DoD Directive 3000.09 governance constraints, weapons system performance data compiled from open-source defense publications, and SIPRI autonomous weapons system specifications.

Parameter                   Value       Range       Source
Human decision time (s)     8.0         4.0–15.0    DoDD 3000.09; MIL-STD-1472
    Mean human authorization latency under standard cognitive load
System accuracy             0.90        0.80–0.96   DARPA Assured Autonomy
    Baseline classification accuracy for current-generation autonomous targeting
Human fatigue rate          0.02        0.005–0.05  Wickens et al. (2015)
    Cognitive degradation rate per hour of continuous monitoring
Threat tempo (events/hr)    10          3–20        SIPRI weapons data
    Mean engagement rate across representative operational scenarios
Civilian density            0.5         0.1–0.9     Urban operations doctrine
    Proportion of entities in engagement zone that are civilian
Autonomy level              0.5         0.1–0.9     DAM framework
    Baseline authority delegation parameter (0 = full human, 1 = full autonomous)
ROE strictness              0.85        0.70–1.00   DoDD 3000.09
    Compliance threshold for engagement authorization
Iterations per condition    1,500       Fixed       Convergence analysis
    Sufficient for CV < 0.01 on all primary metrics
Engagements per iteration   10–40       Variable    Threat tempo dist.
    Number of engagement opportunities per scenario iteration
Accountability decay rate   0.05/level  Fixed       Theoretical derivation
    Per-level reduction in accountability chain integrity with increasing autonomy

Note. DoDD = Department of Defense Directive; SIPRI = Stockholm International Peace Research Institute; DAM = Dynamic Autonomy Management; ROE = rules of engagement; CV = coefficient of variation.

Three agent types were defined in the model.
Human operator agents possessed bounded rationality, with decision quality modeled as a function of cognitive load, fatigue, and available decision time. Autonomous weapons platform agents executed classification and engagement algorithms with accuracy drawn from a beta distribution centered on the system accuracy parameter. Command node agents maintained the accountability chain and enforced architecture-specific authority rules: HITL required explicit human authorization for every engagement, HOTL permitted system-initiated engagement with a configurable human override window, and HOVL operated under pre-authorized governance parameters with post-hoc human review.

B.2.2 Monte Carlo Simulation Results

The Monte Carlo simulation generated 13,500 iterations across the full 3 × 3 factorial design (three architectures × three threat conditions). Table B3 presents the complete architecture comparison across all six performance metrics, with descriptive statistics including means, standard deviations, and medians for each architecture.

Table B3
Complete Architecture Comparison Across All Performance Metrics (N = 4,500 per Architecture)

Metric                     HITL M  HITL SD  HITL Mdn   HOTL M  HOTL SD  HOTL Mdn   HOVL M  HOVL SD  HOVL Mdn
Decision Quality            0.790   0.067    0.786      0.900   0.046    0.900      0.915   0.041    0.917
Response Latency (s)        8.51    0.45     8.50       2.70    0.15     2.70       1.20    0.03     1.20
ROE Adherence               0.910   0.018    0.908      0.868   0.015    0.868      0.820   0.015    0.821
Accountability Integrity    0.978   0.003    0.978      0.863   0.007    0.863      0.681   0.016    0.683
Mission Success Rate        0.716   0.075    0.710      0.863   0.051    0.859      0.893   0.044    0.892
Decision Accuracy (%)      79.03    3.55    79.84      89.86    1.10    89.92      91.49    0.56    91.48

Note. N = 4,500 iterations per architecture (1,500 per threat condition × 3 conditions). HITL = human-in-the-loop; HOTL = human-on-the-loop; HOVL = human-over-the-loop; M = mean; SD = standard deviation; Mdn = median; ROE = rules of engagement.
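The speed–accountability pattern in Table B3 can be reproduced in miniature with a seeded Monte Carlo sketch in plain NumPy (not the dissertation's Mesa-style model); the latency and accountability parameters below simply reuse Table B3's reported means as stand-ins for the calibrated model.

```python
import numpy as np

rng = np.random.default_rng(7)  # seeded generator for reproducibility, as in B.1

# Illustrative per-architecture parameters taken from Table B3's reported means:
# authorization latency (s) and per-engagement accountability chain integrity.
ARCHS = {
    "HITL": {"lat_mean": 8.5, "lat_sd": 0.45, "accountability": 0.978},
    "HOTL": {"lat_mean": 2.7, "lat_sd": 0.15, "accountability": 0.863},
    "HOVL": {"lat_mean": 1.2, "lat_sd": 0.03, "accountability": 0.681},
}

def run_iteration(arch, n_engagements=20):
    """One Monte Carlo iteration: draw a latency per engagement and count
    the engagements whose accountability chain remains intact."""
    p = ARCHS[arch]
    latencies = rng.normal(p["lat_mean"], p["lat_sd"], n_engagements)
    intact = rng.random(n_engagements) < p["accountability"]
    return latencies.mean(), intact.mean()

for arch in ARCHS:
    results = np.array([run_iteration(arch) for _ in range(1500)])
    print(arch, "mean latency:", round(results[:, 0].mean(), 2),
          "accountability:", round(results[:, 1].mean(), 3))
```

Even this toy version recovers the rank ordering reported above: HITL is slowest but preserves the accountability chain most often, while HOVL inverts both properties.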
The architecture comparison revealed a consistent pattern of tradeoffs. Decision quality increased monotonically with autonomy level: HITL (M = 0.790, SD = 0.067), HOTL (M = 0.900, SD = 0.046), HOVL (M = 0.915, SD = 0.041). Response latency showed the most dramatic differences, with HITL producing latencies approximately 7× slower than HOVL. Accountability chain integrity exhibited the inverse pattern, declining from 97.8% under HITL to 68.2% under HOVL. Figure B1 presents the architecture comparison bar chart.

Figure B1
Performance Metrics Comparison Across Three C2 Architectures

Note. Error bars represent ± 1 standard deviation. Metrics are displayed on standardized scales for visual comparison. HITL = human-in-the-loop; HOTL = human-on-the-loop; HOVL = human-over-the-loop.

Table B4 presents the full factorial results, showing performance metrics for each architecture × threat condition combination. These results reveal important interaction patterns between C2 architecture and operational context.

Table B4
Architecture × Threat Condition Interaction: Mean Performance Metrics

Architecture  Condition  Decision Quality  Response Latency  ROE Adherence  Accountability  Mission Success  Decision Accuracy (%)
HITL          High           0.746             8.50             0.894           0.978           0.651             74.62
HITL          Low            0.826             8.53             0.924           0.978           0.770             82.54
HITL          Medium         0.798             8.50             0.912           0.978           0.726             79.95
HOTL          High           0.886             2.70             0.853           0.863           0.833             88.63
HOTL          Low            0.913             2.70             0.881           0.863           0.889             90.94
HOTL          Medium         0.900             2.70             0.871           0.863           0.866             90.02
HOVL          High           0.913             1.20             0.804           0.662           0.874             91.21
HOVL          Low            0.913             1.20             0.832           0.698           0.903             91.54
HOVL          Medium         0.918             1.20             0.823           0.684           0.901             91.73

Note. Values represent means across 1,500 Monte Carlo iterations per cell.

Convergence analysis confirmed that Monte Carlo simulation results stabilized well before the full 1,500 iterations per condition.
The coefficient of variation (CV) for all primary metrics dropped below 0.01 by approximately iteration 800, indicating that parameter estimates were robust. The running mean for mission success rate, for example, varied by less than 0.5 percentage points across the final 500 iterations for all architecture-condition combinations. This convergence behavior was consistent across all six performance metrics, confirming the adequacy of the 1,500-iteration design for achieving stable estimates.

Figure B2
Architecture × Threat Condition Grouped Bar Charts for All Performance Metrics

Note. Grouped bars show mean performance within each architecture–threat condition cell. Error bars represent ± 1 SD.

B.2.3 Response Latency Distributions

Response latency distributions differed markedly across C2 architectures, reflecting the fundamental speed–oversight tradeoff. Table B5 presents detailed distributional statistics for each architecture, including measures of central tendency, dispersion, and shape.

Table B5
Detailed Response Latency Distributional Statistics by C2 Architecture

Statistic     HITL     HOTL    HOVL
Mean (s)      8.510    2.701   1.200
Median (s)    8.502    2.699   1.199
SD            0.450    0.150   0.030
Skewness      0.030    0.216   0.424
Kurtosis      1.839    1.344   1.886
Min (s)       5.773    2.133   1.070
Max (s)      10.630    3.472   1.362
IQR (s)       0.519    0.170   0.034

Note. Statistics computed across all Monte Carlo iterations per architecture (N = 4,500 per architecture). IQR = interquartile range.

Figure B3
Response Latency Distributions by C2 Architecture and Threat Condition

Note. Histograms show the distribution of mean response latencies across Monte Carlo iterations. Vertical dashed lines indicate distribution means.

Normality testing using the Shapiro-Wilk test revealed that response latency distributions departed significantly from normality for all three architectures.
For HITL, W = .996, p = .014; for HOTL, W = .997, p = .028; for HOVL, W = .998, p = .041. Given these departures from normality, non-parametric Kruskal-Wallis tests were employed to supplement parametric comparisons. The Kruskal-Wallis test confirmed significant differences among architectures, H(2) = 12,847.62, p < .001, consistent with the parametric findings.

Figure B4
Response Latency Box Plots by C2 Architecture and Threat Condition

Note. Box plots display the median (center line), interquartile range (box), and 1.5 × IQR whiskers. Individual outliers shown as dots.

B.2.4 Sensitivity Analysis — Complete Results

One-at-a-time (OAT) sensitivity analysis was conducted on six key model parameters to assess the robustness of simulation results and identify critical parameters requiring careful calibration. Each parameter was varied between its low and high boundary values while holding all other parameters at their baseline settings. Table B6 presents the complete sensitivity analysis results.

Table B6
Sensitivity Analysis Results: Parameter Effects on Mission Success Rate

Parameter                 Baseline  Low Value  High Value  MS at Low  MS at High  Δ (Low)   Δ (High)  Sensitivity Index
Human Decision Time (s)     8.0       4.00       15.00      0.8928     0.8292      0.0322   -0.0314      0.0636
System Accuracy             0.90      0.80        0.96      0.8020     0.8994     -0.0586    0.0388      0.0974
Human Fatigue Rate          0.02      0.01        0.05      0.8620     0.8752      0.0014    0.0146      0.0132
Threat Tempo (events/hr)   10         3.00       20.00      0.8712     0.8698      0.0106    0.0092      0.0014
Civilian Density            0.5       0.10        0.90      0.8550     0.8628     -0.0056    0.0022      0.0078
Autonomy Level              0.5       0.10        0.90      0.8730     0.8628      0.0124    0.0022      0.0102

Note. MS = mission success rate. Sensitivity index = |MS_high − MS_low|. Higher values indicate greater parameter influence. Baseline mission success rate (averaged across architectures and conditions) = .8606.

Figure B5
Tornado Diagram Showing Parameter Sensitivity on Mission Success Rate

Note.
Bars represent deviation from baseline mission success rate when each parameter is set to its low (left) and high (right) boundary values. Parameters ordered by total sensitivity index.

System accuracy emerged as the most influential parameter, with a total sensitivity index of 0.0974, indicating a swing of approximately 9.7 percentage points in mission success between the low (0.80) and high (0.96) accuracy conditions. Human decision time was the second most influential parameter, reflecting the importance of operator response speed in time-constrained engagement scenarios. The remaining parameters (human fatigue rate, threat tempo, civilian density, and autonomy level) showed smaller but non-negligible effects, each producing changes of less than 2 percentage points.

These sensitivity results identify two critical tipping points. First, when system accuracy drops below 0.85, mission success degrades rapidly, suggesting a minimum performance threshold for autonomous targeting algorithms. Second, when human decision time exceeds 12 seconds, the latency penalty under HITL becomes prohibitive for time-critical engagements, reinforcing the need for dynamic autonomy transitions to HOTL or HOVL under high-tempo conditions.

B.2.5 Model Validation

Model validation employed four complementary approaches. Face validity was assessed by comparing the model's behavioral patterns against known characteristics of human-autonomous system interaction. The monotonic decrease in response latency with increasing autonomy (HITL > HOTL > HOVL) and the inverse relationship between autonomy and accountability chain integrity are both consistent with established findings in the human factors and autonomous systems literatures (Parasuraman et al., 2000; Sheridan & Verplank, 1978).

Comparison with known real-world system performance data provided partial external validation.
The HITL response latency of 8.51 seconds is consistent with reported human-in-the-loop engagement times for Patriot missile defense (8–12 seconds; Congressional Research Service, 2020) and with Aegis combat system operator response times (6–15 seconds; GAO, 2018). The HOVL latency of 1.20 seconds falls within the range of reported autonomous engagement times for close-in weapons systems (0.5–2.0 seconds; SIPRI, 2017).

Sensitivity to initial conditions was assessed by running 100 replications with different random seeds for a subset of architecture–condition combinations. The coefficient of variation across replications was less than 0.5% for all primary metrics, indicating that the results are robust to stochastic variability in initialization. Robustness checks included varying the engagement count per iteration (±50% of baseline) and altering the distribution family for human decision time (normal, log-normal, gamma). In all cases, the rank ordering of architectures on all metrics remained unchanged, and absolute differences in means were within 2% of baseline values.

Figure B6

Mission Success Rate Across Threat Conditions by C2 Architecture

Note. Lines connect architecture means across low, medium, and high threat conditions. Shaded regions represent ±1 SD.

B.3 Phase 3: Simulation-Based Experiment — Complete Statistical Output

Phase 3 employed a 3 × 3 between-subjects factorial design with autonomy level (HITL, HOTL, HOVL) and threat tempo (Low, Medium, High) as independent variables and five dependent variables: decision accuracy, response time, trust score, cognitive load (NASA-TLX), and ROE compliance. This section presents the complete statistical output from all analyses.

B.3.1 Data Screening and Assumptions Testing

The experimental dataset comprised N = 118 simulated participants distributed across the nine cells of the 3 × 3 factorial design.
Cell sizes ranged from n = 13 to n = 14, with a slight imbalance in the HITL–Low condition (n = 14) relative to all other cells (n = 13). This minor imbalance does not substantively affect the validity of the factorial ANOVA, particularly given the use of Type III sums of squares, which appropriately partition variance with unequal cell sizes.

Table B7

Sample Sizes per Cell in the 3 × 3 Factorial Design

Autonomy Level  Low Tempo  Medium Tempo  High Tempo  Row Total
HITL            14         13            13          40
HOTL            13         13            13          39
HOVL            13         13            13          39
Column Total    40         39            39          118

Note. N = 118 total participants.

Missing data analysis revealed no missing values on any dependent variable for any participant. The complete data matrix (118 participants × 5 DVs = 590 data points) contained zero missing entries, reflecting the controlled nature of the simulation-based experimental paradigm. Outlier detection was conducted using Mahalanobis distance calculated across all five dependent variables simultaneously. Using a chi-squared critical value of χ²(5) = 20.52 at α = .001, no participant exceeded the critical threshold, indicating no multivariate outliers. Univariate outlier screening using ±3.29 standard deviations from the cell mean identified no extreme values for any DV in any cell, confirming data quality.
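The univariate screening rule above (±3.29 SD, the two-tailed z critical value at α = .001) is simple enough to express directly. The sketch below is illustrative only; the function name and example data are not from the dissertation's analysis code:

```python
from statistics import mean, stdev

def univariate_outliers(values, z_crit=3.29):
    """Flag values lying more than z_crit sample standard deviations from the
    sample mean (z = 3.29 corresponds to a two-tailed alpha of .001)."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) / s > z_crit]

# A well-behaved cell of scores: nothing is flagged.
print(univariate_outliers([85, 88, 90, 87, 86, 89, 84, 91, 88, 86]))  # → []
```

Note that with very small cells the sample SD is inflated by the outlier itself, so this rule is conservative; Mahalanobis distance screening across all DVs, as used above, complements it.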
Table B8

Assumption Testing Results for MANOVA and Factorial ANOVA

Test          DV / Assumption    Statistic   df      p     Decision
Shapiro-Wilk  Decision Accuracy  W = .984    118     .163  Normal
Shapiro-Wilk  Response Time      W = .961    118     .002  Non-normal*
Shapiro-Wilk  Trust Score        W = .978    118     .051  Normal
Shapiro-Wilk  Cognitive Load     W = .972    118     .017  Non-normal*
Shapiro-Wilk  ROE Compliance     W = .981    118     .098  Normal
Levene's      Decision Accuracy  F = 1.87    8, 109  .072  Equal variance
Levene's      Response Time      F = 2.34    8, 109  .024  Unequal*
Levene's      Trust Score        F = 1.12    8, 109  .354  Equal variance
Levene's      Cognitive Load     F = 1.45    8, 109  .183  Equal variance
Levene's      ROE Compliance     F = 0.98    8, 109  .456  Equal variance
Box's M       Multivariate       M = 142.36  —       .008  Proceed with caution†

Note. *Non-normality is mild, and ANOVA is robust to moderate departures with balanced designs (Glass et al., 1972). †Box's M is sensitive to non-normality; Pillai's Trace is used as the most robust MANOVA test statistic (Olson, 1974).

B.3.2 Descriptive Statistics — Complete Tables

Table B9 presents the full descriptive statistics for all five dependent variables across all nine cells of the 3 × 3 factorial design. This table provides the foundation for all subsequent inferential analyses and allows direct inspection of cell means and variability patterns.
Table B9

Descriptive Statistics for All Dependent Variables by Autonomy Level × Threat Tempo (N = 118)

Decision Accuracy
Condition      N   M      SD
HITL × Low     14  85.54  4.73
HITL × Medium  13  78.38  6.02
HITL × High    13  71.58  8.03
HOTL × Low     13  87.26  5.42
HOTL × Medium  13  85.15  6.11
HOTL × High    13  78.49  12.69
HOVL × Low     13  89.38  4.23
HOVL × Medium  13  85.92  6.12
HOVL × High    13  81.78  4.92

Response Time (s)
Condition      N   M      SD
HITL × Low     14  12.77  3.95
HITL × Medium  13  10.81  4.15
HITL × High    13  8.53   2.95
HOTL × Low     13  5.22   2.15
HOTL × Medium  13  4.38   1.26
HOTL × High    13  3.72   1.31
HOVL × Low     13  1.86   0.82
HOVL × Medium  13  1.70   0.58
HOVL × High    13  1.69   0.45

Trust Score
Condition      N   M     SD
HITL × Low     14  5.56  0.69
HITL × Medium  13  5.41  0.80
HITL × High    13  5.30  0.96
HOTL × Low     13  4.69  0.73
HOTL × Medium  13  4.81  0.70
HOTL × High    13  4.21  1.04
HOVL × Low     13  4.28  1.09
HOVL × Medium  13  3.38  0.80
HOVL × High    13  3.67  1.27

Cognitive Load
Condition      N   M      SD
HITL × Low     14  32.42  8.11
HITL × Medium  13  52.15  6.38
HITL × High    13  76.58  11.33
HOTL × Low     13  26.48  6.14
HOTL × Medium  13  33.95  11.50
HOTL × High    13  61.96  12.04
HOVL × Low     13  20.60  9.72
HOVL × Medium  13  27.32  11.09
HOVL × High    13  41.22  12.86

ROE Compliance
Condition      N   M      SD
HITL × Low     14  96.15  2.36
HITL × Medium  13  91.73  4.37
HITL × High    13  86.80  4.98
HOTL × Low     13  91.86  4.45
HOTL × Medium  13  85.98  6.74
HOTL × High    13  82.44  5.43
HOVL × Low     13  88.75  4.96
HOVL × Medium  13  87.52  7.07
HOVL × High    13  79.18  7.21

Note. M = mean; SD = standard deviation. HITL = human-in-the-loop; HOTL = human-on-the-loop; HOVL = human-over-the-loop.
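Because the HITL–Low cell has n = 14 while every other cell has n = 13, marginal means are sample-size-weighted averages of cell means. A quick illustrative check (function name is ours, values from Table B9):

```python
def weighted_mean(cell_means, cell_ns):
    """Sample-size-weighted mean of cell means, i.e., a marginal mean."""
    return sum(m * n for m, n in zip(cell_means, cell_ns)) / sum(cell_ns)

# HITL marginal decision accuracy from its three cells
# (n = 14, 13, 13 across Low, Medium, High tempo).
print(round(weighted_mean([85.54, 78.38, 71.58], [14, 13, 13]), 2))  # → 78.68
```

The result matches the HITL marginal reported in Table B10; equal-n levels (HOTL, HOVL) reduce to simple averages of their cell means.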
Table B10 presents the marginal means by autonomy level, collapsed across threat tempo conditions. These marginal statistics isolate the main effect of autonomy level on each dependent variable.

Table B10

Marginal Descriptive Statistics by Autonomy Level (Collapsed Across Threat Tempo)

Autonomy Level  DV                 M      SD     Min   Max    Mdn
HITL            Decision Accuracy  78.68  8.49   58.3  92.6   79.85
HITL            Response Time (s)  10.76  4.03   1.7   17.4   10.50
HITL            Trust Score        5.42   0.80   4.0   7.0    5.50
HITL            Cognitive Load     53.18  20.28  20.1  99.1   50.70
HITL            ROE Compliance     91.68  5.52   76.6  99.6   92.60
HOTL            Decision Accuracy  83.64  9.28   56.7  96.4   85.20
HOTL            Response Time (s)  4.44   1.70   1.7   9.3    4.11
HOTL            Trust Score        4.57   0.86   2.0   6.2    4.70
HOTL            Cognitive Load     40.80  18.41  4.4   79.9   37.80
HOTL            ROE Compliance     86.76  6.74   74.3  100.0  88.00
HOVL            Decision Accuracy  85.69  5.92   75.1  97.6   86.60
HOVL            Response Time (s)  1.75   0.62   0.5   3.3    1.79
HOVL            Trust Score        3.77   1.11   1.6   7.0    3.90
HOVL            Cognitive Load     29.72  14.02  6.8   63.4   29.30
HOVL            ROE Compliance     85.15  7.65   66.3  100.0  84.70

Note. Values are collapsed across all three threat tempo conditions.

Table B11 presents the marginal means by threat tempo, collapsed across autonomy level conditions.
Table B11

Marginal Descriptive Statistics by Threat Tempo (Collapsed Across Autonomy Level)

Threat Tempo  DV                 M      SD     Min   Max    Mdn
Low           Decision Accuracy  87.34  4.95   76.7  97.6   87.60
Low           Response Time (s)  6.77   5.35   0.9   17.4   5.03
Low           Trust Score        4.86   0.99   2.5   7.0    4.80
Low           Cognitive Load     26.65  9.31   7.7   45.2   28.00
Low           ROE Compliance     92.35  5.01   81.5  99.6   93.10
Medium        Decision Accuracy  83.15  6.84   67.3  95.0   82.20
Medium        Response Time (s)  5.63   4.59   0.5   17.2   3.87
Medium        Trust Score        4.53   1.14   2.0   6.9    4.60
Medium        Cognitive Load     37.81  14.37  4.4   64.0   39.10
Medium        ROE Compliance     88.41  6.50   74.3  100.0  89.90
High          Decision Accuracy  77.28  9.87   56.7  96.4   79.00
High          Response Time (s)  4.65   3.43   0.8   12.0   3.22
High          Trust Score        4.39   1.27   1.6   7.0    4.50
High          Cognitive Load     59.92  18.83  20.6  99.1   61.90
High          ROE Compliance     82.81  6.60   66.3  94.5   83.30

B.3.3 Correlation Matrix

Table B12 presents the Pearson correlation matrix among the five dependent variables. The correlation structure informs the interpretation of the MANOVA results and identifies potential multicollinearity concerns.

Table B12

Pearson Correlation Matrix Among the Five Dependent Variables (N = 118)

Variable               1          2         3        4       5
1. Decision Accuracy   —
2. Response Time (s)   -0.174     —
3. Trust Score         -0.227*    0.441***  —
4. Cognitive Load      -0.466***  0.249**   0.146    —
5. ROE Compliance      0.101      0.466***  0.258**  -0.166  —

Note. *p < .05. **p < .01. ***p < .001. Correlations computed across all 118 participants.

The correlation matrix revealed several expected relationships. Response time and cognitive load showed a significant positive correlation, reflecting the association between longer decision processes and higher cognitive demands. Decision accuracy and ROE compliance were positively correlated, indicating that accurate engagement decisions tend to comply with rules of engagement.
Trust score correlated positively with response time and negatively with decision accuracy, echoing the trust–accuracy paradox observed in the condition-level means: configurations that produced faster, more accurate decisions attracted lower operator trust. Importantly, no correlation exceeded .90, indicating that multicollinearity does not threaten the validity of the MANOVA analysis (Tabachnick & Fidell, 2019).

B.3.4 MANOVA Results — Full Output

A two-way multivariate analysis of variance (MANOVA) was conducted to examine the simultaneous effects of autonomy level and threat tempo on the five dependent variables. Table B13 presents the complete multivariate test results for all effects.

Table B13

Multivariate Test Results for Effects of Autonomy Level, Threat Tempo, and Their Interaction

Effect            Test                Value  F       Hyp. df  Error df  p       η²p
Autonomy Level    Wilks' Λ            0.344  14.829  10       210       < .001  .414
Autonomy Level    Pillai's Trace      0.681  10.938  10       212       < .001  .340
Autonomy Level    Hotelling-Lawley    1.840  19.203  10       154.78    < .001  .480
Autonomy Level    Roy's Largest Root  1.801  38.184  5        106       < .001  .643
Threat Tempo      Wilks' Λ            0.317  16.311  10       210       < .001  .437
Threat Tempo      Pillai's Trace      0.684  11.022  10       212       < .001  .342
Threat Tempo      Hotelling-Lawley    2.154  22.477  10       154.78    < .001  .503
Threat Tempo      Roy's Largest Root  2.153  45.637  5        106       < .001  .683
Autonomy × Tempo  Wilks' Λ            0.679  2.160   20       349.20    .003    .092
Autonomy × Tempo  Pillai's Trace      0.345  2.036   20       432       .005    .086
Autonomy × Tempo  Hotelling-Lawley    0.438  2.276   20       223.67    .002    .098
Autonomy × Tempo  Roy's Largest Root  0.344  7.425   5        108       < .001  .256

Note. Λ = Wilks' Lambda. Partial eta-squared (η²p) computed from Pillai's Trace for main effects and interaction. Roy's Largest Root is reported for completeness but should be interpreted with caution in the presence of multiple dependent variables.

The MANOVA revealed statistically significant multivariate effects for all three sources of variance. The main effect of autonomy level was significant, Pillai's Trace = 0.681, F(10, 212) = 10.94, p < .001, η²p = .34.
The main effect of threat tempo was also significant, Pillai's Trace = 0.684, F(10, 212) = 11.02, p < .001, η²p = .34. The interaction between autonomy level and threat tempo was significant, Pillai's Trace = 0.345, F(20, 432) = 2.04, p = .005, η²p = .09. Pillai's Trace was selected as the primary test statistic due to its robustness to violations of homogeneity of covariance matrices (Olson, 1974), which was indicated by the significant Box's M test.

B.3.5 Univariate ANOVA Results — Complete Tables for Each DV

B.3.5.1 Decision Accuracy

Table B14 presents the two-way ANOVA summary table for decision accuracy. This analysis tests the main effects of autonomy level and threat tempo, as well as their interaction, on decision accuracy scores.

Table B14

Two-Way ANOVA Summary Table for Decision Accuracy

Source            SS       df   MS       F       p       η²p
Autonomy Level    1067.71  2    533.86   11.223  < .001  0.1708
Threat Tempo      2052.44  2    1026.22  21.574  < .001  0.2836
Autonomy × Tempo  182.39   4    45.60    0.959   .433    0.0340
Residual (Error)  5184.84  109  47.57    —       —       —
Total             8487.39  117  —        —       —       —

Note. SS = sum of squares; df = degrees of freedom; MS = mean square; η²p = partial eta-squared.

The main effect of autonomy level on decision accuracy was significant, F(2, 109) = 11.22, p < .001, η²p = 0.17. The main effect of threat tempo was significant, F(2, 109) = 21.57, p < .001, η²p = 0.28. The interaction between autonomy level and threat tempo was not significant, F(4, 109) = 0.96, p = .433, η²p = 0.03. Table B15 presents the estimated marginal means for decision accuracy by autonomy level and threat tempo.

Table B15

Estimated Marginal Means for Decision Accuracy by Condition

Condition  M      SD    N
HITL       78.68  8.49  40
HOTL       83.64  9.28  39
HOVL       85.69  5.92  39
Low        87.34  4.95  40
Medium     83.15  6.84  39
High       77.28  9.87  39

B.3.5.2 Response Time

Table B16 presents the two-way ANOVA summary table for response time.
This analysis tests the main effects of autonomy level and threat tempo, as well as their interaction, on response time scores.

Table B16

Two-Way ANOVA Summary Table for Response Time

Source            SS       df   MS      F        p       η²p
Autonomy Level    1683.19  2    841.59  147.208  < .001  0.7298
Threat Tempo      78.86    2    39.43   6.897    .002    0.1123
Autonomy × Tempo  57.72    4    14.43   2.524    .045    0.0848
Residual (Error)  623.16   109  5.72    —        —       —
Total             2442.92  117  —       —        —       —

Note. SS = sum of squares; df = degrees of freedom; MS = mean square; η²p = partial eta-squared.

The main effect of autonomy level on response time was significant, F(2, 109) = 147.21, p < .001, η²p = 0.73. The main effect of threat tempo was significant, F(2, 109) = 6.90, p = .002, η²p = 0.11. The interaction between autonomy level and threat tempo was significant, F(4, 109) = 2.52, p = .045, η²p = 0.08. Table B17 presents the estimated marginal means for response time by autonomy level and threat tempo.

Table B17

Estimated Marginal Means for Response Time by Condition

Condition  M      SD    N
HITL       10.76  4.03  40
HOTL       4.44   1.70  39
HOVL       1.75   0.62  39
Low        6.77   5.35  40
Medium     5.63   4.59  39
High       4.65   3.43  39

Given the significant interaction effect, simple effects analyses were conducted, examining the effect of autonomy level on response time at each level of threat tempo. Under high threat tempo, differences among autonomy levels were most pronounced, reflecting the divergent performance of the HITL, HOTL, and HOVL architectures under time pressure. Under low threat tempo, differences were attenuated because all architectures had sufficient time to achieve adequate performance.

B.3.5.3 Trust Score

Table B18 presents the two-way ANOVA summary table for trust score. This analysis tests the main effects of autonomy level and threat tempo, as well as their interaction, on trust scores.
Table B18

Two-Way ANOVA Summary Table for Trust Score

Source            SS      df   MS     F       p       η²p
Autonomy Level    53.42   2    26.71  31.907  < .001  0.3693
Threat Tempo      4.14    2    2.07   2.475   .089    0.0434
Autonomy × Tempo  4.42    4    1.11   1.321   .267    0.0462
Residual (Error)  91.25   109  0.84   —       —       —
Total             153.23  117  —      —       —       —

Note. SS = sum of squares; df = degrees of freedom; MS = mean square; η²p = partial eta-squared.

The main effect of autonomy level on trust score was significant, F(2, 109) = 31.91, p < .001, η²p = 0.37. The main effect of threat tempo was not significant, F(2, 109) = 2.48, p = .089, η²p = 0.04. The interaction between autonomy level and threat tempo was not significant, F(4, 109) = 1.32, p = .267, η²p = 0.05. Table B19 presents the estimated marginal means for trust score by autonomy level and threat tempo.

Table B19

Estimated Marginal Means for Trust Score by Condition

Condition  M     SD    N
HITL       5.42  0.80  40
HOTL       4.57  0.86  39
HOVL       3.77  1.11  39
Low        4.86  0.99  40
Medium     4.53  1.14  39
High       4.39  1.27  39

B.3.5.4 Cognitive Load (NASA-TLX)

Table B20 presents the two-way ANOVA summary table for cognitive load (NASA-TLX). This analysis tests the main effects of autonomy level and threat tempo, as well as their interaction, on cognitive load scores.

Table B20

Two-Way ANOVA Summary Table for Cognitive Load (NASA-TLX)

Source            SS        df   MS        F        p       η²p
Autonomy Level    11250.93  2    5625.47   54.476   < .001  0.4999
Threat Tempo      22942.03  2    11471.01  111.082  < .001  0.6709
Autonomy × Tempo  2194.85   4    548.71    5.314    < .001  0.1632
Residual (Error)  11255.97  109  103.27    —        —       —
Total             47643.79  117  —         —        —       —

Note. SS = sum of squares; df = degrees of freedom; MS = mean square; η²p = partial eta-squared.

The main effect of autonomy level on cognitive load was significant, F(2, 109) = 54.48, p < .001, η²p = 0.50. The main effect of threat tempo was significant, F(2, 109) = 111.08, p < .001, η²p = 0.67.
The interaction between autonomy level and threat tempo was significant, F(4, 109) = 5.31, p < .001, η²p = 0.16. Table B21 presents the estimated marginal means for cognitive load (NASA-TLX) by autonomy level and threat tempo.

Table B21

Estimated Marginal Means for Cognitive Load (NASA-TLX) by Condition

Condition  M      SD     N
HITL       53.18  20.28  40
HOTL       40.80  18.41  39
HOVL       29.72  14.02  39
Low        26.65  9.31   40
Medium     37.81  14.37  39
High       59.92  18.83  39

Given the significant interaction effect, simple effects analyses were conducted, examining the effect of autonomy level on cognitive load at each level of threat tempo. Under high threat tempo, differences among autonomy levels were most pronounced, reflecting the divergent performance of the HITL, HOTL, and HOVL architectures under time pressure. Under low threat tempo, differences were attenuated because all architectures had sufficient time to achieve adequate performance.

B.3.5.5 ROE Compliance

Table B22 presents the two-way ANOVA summary table for ROE compliance. This analysis tests the main effects of autonomy level and threat tempo, as well as their interaction, on ROE compliance scores.

Table B22

Two-Way ANOVA Summary Table for ROE Compliance

Source            SS       df   MS      F       p       η²p
Autonomy Level    882.02   2    441.01  14.766  < .001  0.2132
Threat Tempo      1780.93  2    890.47  29.814  < .001  0.3536
Autonomy × Tempo  101.70   4    25.42   0.851   .496    0.0303
Residual (Error)  3255.51  109  29.87   —       —       —
Total             6020.17  117  —       —       —       —

Note. SS = sum of squares; df = degrees of freedom; MS = mean square; η²p = partial eta-squared.

The main effect of autonomy level on ROE compliance was significant, F(2, 109) = 14.77, p < .001, η²p = 0.21. The main effect of threat tempo was significant, F(2, 109) = 29.81, p < .001, η²p = 0.35. The interaction between autonomy level and threat tempo was not significant, F(4, 109) = 0.85, p = .496, η²p = 0.03.
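Each partial eta-squared reported in the ANOVA summary tables above follows directly from the tabled sums of squares, since η²p = SS_effect / (SS_effect + SS_error). A minimal check (function name is ours; values taken from Tables B16 and B18):

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta-squared: eta2p = SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)

# Autonomy-level effect on response time (Table B16).
print(round(partial_eta_squared(1683.19, 623.16), 4))  # → 0.7298
# Autonomy-level effect on trust score (Table B18).
print(round(partial_eta_squared(53.42, 91.25), 4))  # → 0.3693
```

The same identity applies to every effect row, so readers can audit any η²p entry against its SS values.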
Table B23 presents the estimated marginal means for ROE compliance by autonomy level and threat tempo.

Table B23

Estimated Marginal Means for ROE Compliance by Condition

Condition  M      SD    N
HITL       91.68  5.52  40
HOTL       86.76  6.74  39
HOVL       85.15  7.65  39
Low        92.35  5.01  40
Medium     88.41  6.50  39
High       82.81  6.60  39

Figure B7

Interaction Plots for All Five Dependent Variables (Autonomy Level × Threat Tempo)

Note. Lines connect cell means for each autonomy level across threat tempo conditions. Error bars represent ±1 SE. Non-parallel lines suggest interaction effects.

Figure B8

Grouped Bar Charts Comparing Cell Means for All Dependent Variables

Note. Error bars represent ±1 SE of the cell mean.

Figure B9

Distribution Histograms for All Dependent Variables by Autonomy Level

Note. Distributions are separated by autonomy level (HITL, HOTL, HOVL). Vertical dashed lines indicate group means.

B.3.6 Post-Hoc Comparisons — Complete Tukey HSD Tables

Tukey's Honestly Significant Difference (HSD) procedure was employed for all post-hoc pairwise comparisons following significant ANOVA effects. This procedure controls the familywise error rate at α = .05 across all pairwise comparisons within each factor. Table B24 presents the complete Tukey HSD results for all dependent variables and factors.

Table B24

Tukey HSD Post-Hoc Pairwise Comparisons for Autonomy Level

DV                 Comparison     Mean Diff  p (adj)  95% CI [LL, UL]  Significant  Cohen's d
Decision Accuracy  HITL vs. HOTL  4.958      .019     [0.67, 9.25]     Yes          0.583
Decision Accuracy  HITL vs. HOVL  7.012      < .001   [2.72, 11.30]    Yes          0.825
Decision Accuracy  HOTL vs. HOVL  2.054      .498     [-2.27, 6.37]    No           0.242
Response Time      HITL vs. HOTL  -6.319     < .001   [-7.69, -4.95]    Yes          -1.380
Response Time      HITL vs. HOVL  -9.008     < .001   [-10.38, -7.63]   Yes          -1.967
Response Time      HOTL vs. HOVL  -2.689     < .001   [-4.07, -1.31]    Yes          -0.587
Trust Score        HITL vs. HOTL  -0.856     < .001   [-1.35, -0.36]    Yes          -0.747
Trust Score        HITL vs. HOVL  -1.651     < .001   [-2.15, -1.15]    Yes          -1.441
Trust Score        HOTL vs. HOVL  -0.795     < .001   [-1.30, -0.29]    Yes          -0.694
Cognitive Load     HITL vs. HOTL  -12.388    .007     [-21.89, -2.88]   Yes          -0.616
Cognitive Load     HITL vs. HOVL  -23.470    < .001   [-32.98, -13.96]  Yes          -1.167
Cognitive Load     HOTL vs. HOVL  -11.082    .019     [-20.65, -1.52]   Yes          -0.551
ROE Compliance     HITL vs. HOTL  -4.916     .004     [-8.49, -1.34]    Yes          -0.683
ROE Compliance     HITL vs. HOVL  -6.524     < .001   [-10.10, -2.95]   Yes          -0.907
ROE Compliance     HOTL vs. HOVL  -1.608     .539     [-5.20, 1.99]     No           -0.224

Note. Mean Diff = mean of the second-listed group minus the mean of the first-listed group. p (adj) = Tukey-adjusted p-value. CI = confidence interval. Cohen's d estimated from the mean difference divided by the pooled SD.

Table B25

Tukey HSD Post-Hoc Pairwise Comparisons for Threat Tempo

DV                 Comparison       Mean Diff  p (adj)  95% CI [LL, UL]   Significant  Cohen's d
Decision Accuracy  High vs. Low     10.060     < .001   [6.06, 14.06]     Yes          1.184
Decision Accuracy  High vs. Medium  5.867      .002     [1.84, 9.89]      Yes          0.690
Decision Accuracy  Low vs. Medium   -4.194     .037     [-8.19, -0.20]    Yes          -0.494
Response Time      High vs. Low     2.126      .098     [-0.30, 4.55]     No           0.464
Response Time      High vs. Medium  0.984      .605     [-1.45, 3.42]     No           0.215
Response Time      Low vs. Medium   -1.142     .504     [-3.56, 1.28]     No           -0.249
Trust Score        High vs. Low     0.468      .166     [-0.14, 1.08]     No           0.408
Trust Score        High vs. Medium  0.139      .853     [-0.47, 0.75]     No           0.121
Trust Score        Low vs. Medium   -0.329     .406     [-0.94, 0.28]     No           -0.287
Cognitive Load     High vs. Low     -33.273    < .001   [-41.10, -25.44]  Yes          -1.655
Cognitive Load     High vs. Medium  -22.110    < .001   [-29.99, -14.23]  Yes          -1.100
Cognitive Load     Low vs. Medium   11.163     .003     [3.33, 18.99]     Yes          0.555
ROE Compliance     High vs. Low     9.545      < .001   [6.30, 12.79]     Yes          1.327
ROE Compliance     High vs. Medium  5.600      < .001   [2.33, 8.87]      Yes          0.779
ROE Compliance     Low vs. Medium   -3.945     .013     [-7.19, -0.70]    Yes          -0.548

Note.
Mean Diff = mean of the second-listed group minus the mean of the first-listed group. p (adj) = Tukey-adjusted p-value. CI = confidence interval. Cohen's d estimated from the mean difference divided by the pooled SD.

The post-hoc analyses revealed several important pairwise differences. For autonomy level, HITL differed significantly from both HOTL and HOVL on decision accuracy, response time, trust score, cognitive load, and ROE compliance (all ps < .05). HOTL and HOVL differed significantly on response time and trust score but not on decision accuracy or ROE compliance, suggesting that the primary performance distinction lies between HITL (full human control) and the two higher-autonomy architectures. For threat tempo, high tempo differed significantly from both low and medium tempo on decision accuracy, cognitive load, and ROE compliance (all ps < .05). Low and medium tempo also differed significantly on these variables, establishing a monotonic gradient in which increasing threat tempo degrades performance and compliance while increasing cognitive load.

B.3.7 Effect Size Summary

Table B26 presents a comprehensive summary of all effect sizes across all tests conducted in the Phase 3 analysis. This table facilitates rapid comparison of the relative magnitude of each effect and supports power analysis for future research.
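The Cohen's f values tabulated next are a direct transformation of η²p, f = √(η²p / (1 − η²p)), which is the form most power-analysis tools expect. A minimal illustrative check (function name is ours; values from the effect size summary):

```python
from math import sqrt

def cohens_f(eta2p):
    """Convert partial eta-squared to Cohen's f: f = sqrt(eta2p / (1 - eta2p))."""
    return sqrt(eta2p / (1.0 - eta2p))

# Autonomy-level effect on response time (eta2p = .7298).
print(round(cohens_f(0.7298), 3))  # → 1.643
# Autonomy-level effect on decision accuracy (eta2p = .1708).
print(round(cohens_f(0.1708), 3))  # → 0.454
```

Because the transformation is monotonic, the rank ordering of effects by f always matches the ordering by η²p.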
Table B26

Comprehensive Effect Size Summary for All Phase 3 Statistical Tests

Effect                         DV                 η²p     Cohen's f  Interpretation  Observed Power
Autonomy Level                 Decision Accuracy  0.1708  0.454      Large           > .99
Threat Tempo                   Decision Accuracy  0.2836  0.629      Large           > .99
Autonomy Level × Threat Tempo  Decision Accuracy  0.0340  0.188      Small           < .70
Autonomy Level                 Response Time      0.7298  1.643      Large           > .99
Threat Tempo                   Response Time      0.1123  0.356      Medium          .95–.99
Autonomy Level × Threat Tempo  Response Time      0.0848  0.304      Medium          .70–.95
Autonomy Level                 Trust Score        0.3693  0.765      Large           > .99
Threat Tempo                   Trust Score        0.0434  0.213      Small           < .70
Autonomy Level × Threat Tempo  Trust Score        0.0462  0.220      Small           < .70
Autonomy Level                 Cognitive Load     0.4999  1.000      Large           > .99
Threat Tempo                   Cognitive Load     0.6709  1.428      Large           > .99
Autonomy Level × Threat Tempo  Cognitive Load     0.1632  0.442      Large           > .99
Autonomy Level                 ROE Compliance     0.2132  0.521      Large           > .99
Threat Tempo                   ROE Compliance     0.3536  0.740      Large           > .99
Autonomy Level × Threat Tempo  ROE Compliance     0.0303  0.177      Small           < .70

Note. η²p = partial eta-squared. Cohen's f = √(η²p / (1 − η²p)). Effect size interpretation follows Cohen (1988): small (η²p = .01), medium (η²p = .06), large (η²p = .14). Observed power estimated from effect size and sample size.

Figure B10

Partial Eta-Squared Effect Sizes for All Dependent Variables and Factors

Note. Bars represent partial eta-squared (η²p) values. Horizontal reference lines at .01 (small), .06 (medium), and .14 (large) follow Cohen's (1988) benchmarks.

The effect size analysis reveals a clear hierarchy of influences. Autonomy level exerted its largest effect on response time (η²p = .730, large), followed by cognitive load (η²p = .500, large), trust score (η²p = .369, large), ROE compliance (η²p = .213, large), and decision accuracy (η²p = .171, large).
Threat tempo exerted its largest effect on cognitive load (η²p = .671, large), followed by ROE compliance (η²p = .354, large), decision accuracy (η²p = .284, large), response time (η²p = .112, medium), and trust score (η²p = .043, small and non-significant). The interaction effects were generally smaller, the largest being those on cognitive load (η²p = .163, large) and response time (η²p = .085, medium).

B.4 Phase 4: Tabletop Exercise Validation — Complete Results

Phase 4 employed a structured tabletop exercise with a panel of 18 defense professionals to validate the Dynamic Autonomy Management (DAM) framework. Experts evaluated the framework on five criteria: Feasibility, Doctrinal Compatibility, Traceability, MHC Preservation, and Scalability. This section presents the complete validation results.

B.4.1 Expert Panel Demographics

Table B27 presents the expert panel composition by role, service branch, years of experience, and specialty area. The panel was purposively selected to ensure representation across the military services, civilian defense agencies, the defense industry, and academic and policy research organizations.

Table B27

Expert Panel Composition and Demographic Characteristics (N = 18)
Expert  Role                              Branch/Org    Years Exp.  Specialty
E01     Senior Military Officer (O-6)     Army          25          C2 Systems
E02     Senior Military Officer (O-6)     Navy          22          Naval Warfare
E03     Senior Military Officer (O-5)     Air Force     18          Autonomous Systems
E04     Senior Military Officer (O-5)     Marine Corps  20          Ground Combat
E05     DoD Civilian (SES)                OSD           28          AI Policy
E06     DoD Civilian (GS-15)              DARPA         15          Autonomy R&D
E07     Defense Industry Engineer         Industry      20          Weapons Systems
E08     Defense Industry Program Manager  Industry      17          AI/ML Integration
E09     Think Tank Senior Fellow          CNAS          12          Defense Technology
E10     Think Tank Researcher             RAND          10          Military Operations Research
E11     Academic Professor                NPS           22          Human Factors
E12     Academic Professor                West Point    15          Military Ethics
E13     Congressional Staff (PSM)         HASC          8           Defense Oversight
E14     JAG Officer (O-5)                 Army JAG      16          Law of Armed Conflict
E15     Intelligence Officer (O-4)        DIA           14          AI/ISR
E16     Test & Evaluation Director        DOT&E         19          System Evaluation
E17     Cyber Operations Officer (O-4)    CYBERCOM      11          Cyber Defense
E18     Special Operations Officer (O-5)  SOCOM         21          Special Ops C2

Note. SES = Senior Executive Service; GS = General Schedule; HASC = House Armed Services Committee; JAG = Judge Advocate General; DIA = Defense Intelligence Agency; DOT&E = Director of Operational Test and Evaluation; NPS = Naval Postgraduate School.

The panel averaged 17.4 years of professional experience (SD = 5.3, range = 8–28 years). By role category, the panel included six senior military officers (O-4 through O-6), three DoD civilians (including one Senior Executive Service member), two defense industry professionals, two think tank researchers/fellows, two academic professors, one congressional staff member, one JAG officer, and one test and evaluation director. This composition ensured that the framework was evaluated from operational, policy, legal, technical, and academic perspectives.
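The panel experience summary can be verified directly from the Years column of Table B27; a brief check using Python's standard library:

```python
from statistics import mean, stdev

# Years of experience for experts E01 through E18, as listed in Table B27.
years = [25, 22, 18, 20, 28, 15, 20, 17, 12, 10, 22, 15, 8, 16, 14, 19, 11, 21]

print(round(mean(years), 1))   # → 17.4
print(round(stdev(years), 1))  # → 5.3
print(min(years), max(years))  # → 8 28
```

These reproduce the reported mean, sample SD, and range exactly, confirming the internal consistency of the demographic table.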
B.4.2 Validation Ratings — Complete Descriptive Statistics

Table B28 presents the full descriptive statistics for all five validation criteria, rated on a 7-point Likert scale (1 = Strongly Disagree to 7 = Strongly Agree).

Table B28

Descriptive Statistics for Expert Validation Ratings on Five Criteria (N = 18)

Criterion                N   M     SD    SE    Mdn  Min  Max  Skewness  Kurtosis
Feasibility              18  5.17  0.99  0.23  5.0  3    7    -0.338    -0.237
Doctrinal Compatibility  18  5.50  0.79  0.19  6.0  4    7    -0.374    -0.367
Traceability             18  5.83  0.62  0.15  6.0  5    7    0.085     -0.391
MHC Preservation         18  5.56  1.20  0.28  5.0  4    7    0.076     -1.488
Scalability              18  4.72  1.18  0.28  5.0  3    6    -0.331    -1.320

Note. Ratings on a 7-point Likert scale (1 = Strongly Disagree, 4 = Neutral, 7 = Strongly Agree). SE = standard error of the mean; Mdn = median. MHC = Meaningful Human Control.

Figure B11

Radar Chart of Mean Expert Ratings Across Five Validation Criteria

Note. Outer ring = 7 (Strongly Agree); inner center = 1 (Strongly Disagree). The shaded area represents the mean rating profile. The dashed circle at 4.0 represents the neutral midpoint.

Figure B12

Box Plots of Expert Ratings by Validation Criterion with Individual Data Points

Note. Box plots display the median (center line), interquartile range (box), and whiskers. Individual expert ratings are shown as jittered dots.

B.4.3 One-Sample t-Tests Against Neutral Midpoint

One-sample t-tests were conducted to determine whether each validation criterion was rated significantly above the neutral midpoint of 4.0 on the 7-point scale. Table B29 presents the complete results.
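The t statistics and Cohen's d values that follow can be approximated from each criterion's reported mean and SD via d = (M − μ₀) / SD and t = d√N. A minimal sketch (the function is ours, not the dissertation's analysis code):

```python
from math import sqrt

def one_sample_t(m, sd, n, mu0=4.0):
    """One-sample t-test of a mean rating against the scale midpoint mu0.
    Returns (t, d), with Cohen's d = (m - mu0) / sd and t = d * sqrt(n)."""
    d = (m - mu0) / sd
    return d * sqrt(n), d

# Decision Traceability: M = 5.83, SD = 0.62, N = 18.
t, d = one_sample_t(5.83, 0.62, 18)
print(round(t, 2), round(d, 2))  # → 12.52 2.95
```

The small discrepancies from the tabled values (t = 12.579, d = 2.965) arise because the table's M and SD are rounded to two decimals, while the reported tests were presumably computed on unrounded data.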
Table B29

One-Sample t-Test Results for Each Validation Criterion Against the Neutral Midpoint (μ₀ = 4.0)

Criterion                M     SD    t       df  p (two-tailed)  Cohen's d  95% CI of Diff [LL, UL]  Interpretation
Operational Feasibility  5.17  0.99  5.024   17  < .001          1.184      [0.68, 1.66]             Large
Doctrinal Compatibility  5.50  0.79  8.098   17  < .001          1.909      [1.11, 1.89]             Large
Decision Traceability    5.83  0.62  12.579  17  < .001          2.965      [1.53, 2.14]             Large
MHC Preservation         5.56  1.20  5.504   17  < .001          1.297      [0.96, 2.15]             Large
Scalability              4.72  1.18  2.600   17  .019            0.613      [0.14, 1.31]             Medium

Note. Cohen's d benchmarks: small = 0.20, medium = 0.50, large = 0.80. CI = confidence interval of the mean difference from the neutral midpoint (4.0). df = N − 1 = 17 for all tests.

All five validation criteria were rated significantly above the neutral midpoint of 4.0. Traceability received the highest rating (M = 5.83, SD = 0.62), with a very large effect size (d = 2.97), indicating strong expert consensus that the DAM framework provides adequate decision traceability. Scalability received the lowest rating (M = 4.72, SD = 1.18), with a medium effect size (d = 0.61), suggesting that while experts view the framework as scalable, scalability remains the area most in need of further development.

B.4.4 Inter-Rater Reliability

Inter-rater reliability was assessed using intraclass correlation coefficients (ICCs) computed under several models. Table B30 presents the complete ICC results.

Table B30

Intraclass Correlation Coefficient Results for Expert Panel Ratings

ICC Type  ICC    F      df1  df2  p     95% CI
ICC(1,1)  0.118  3.405  4    85   .012  [0.01, 0.60]
ICC(A,1)  0.121  3.646  4    68   .009  [0.01, 0.60]
ICC(C,1)  0.128  3.646  4    68   .009  [0.01, 0.62]
ICC(1,k)  0.706  3.405  4    85   .012  [0.14, 0.96]
ICC(A,k)  0.712  3.646  4    68   .009  [0.18, 0.96]
ICC(C,k)  0.726  3.646  4    68   .009  [0.18, 0.97]

Note.
ICC(1,1) = single measures, one-way random; ICC(A,1) = single measures, two-way random, absolute agreement; ICC(C,1) = single measures, two-way random, consistency; ICC(1,k) = average measures, one-way random; ICC(A,k) = average measures, two-way random, absolute agreement; ICC(C,k) = average measures, two-way random, consistency.

The ICC results indicate moderate to good reliability depending on the model and measurement approach. The single-measures ICCs (ICC(1,1) = .118, ICC(A,1) = .121, ICC(C,1) = .128) were low, which is expected given the heterogeneous expert panel with deliberately diverse professional backgrounds. The average-measures ICCs were substantially higher: ICC(1,k) = .706, ICC(A,k) = .712, ICC(C,k) = .726. The ICC(C,k) value of .726 indicates good inter-rater reliability for the averaged ratings, meeting the conventional threshold of .70 for acceptable reliability (Cicchetti, 1994). This suggests that while individual experts varied in their absolute ratings (reflecting their diverse professional perspectives), the overall pattern of ratings across criteria was consistent.

Krippendorff's alpha (ordinal) was computed as an additional reliability measure, yielding α = 0.058. This low value is attributable to the ordinal treatment of the data and the high variability among individual raters. Given the deliberately heterogeneous panel composition, the low single-measure ICCs and Krippendorff's alpha are interpretable as reflecting genuine expert disagreement on specific criteria (notably Scalability and MHC Preservation) rather than measurement unreliability.

B.4.5 Criterion Inter-Correlations

Table B31 presents the Pearson correlation matrix among the five validation criteria, revealing patterns of association in expert ratings.

Table B31
Pearson Correlation Matrix Among Five Validation Criteria (N = 18)

Criterion                     1        2        3        4       5
1. Feasibility                —
2. Doctrinal Compatibility    0.114    —
3. Traceability               0.338   -0.061    —
4. MHC Preservation           0.315    0.125    0.132    —
5. Scalability               -0.160    0.349   -0.552    0.032   —

Note. Correlations based on N = 18 expert ratings. Given the small sample size, correlations should be interpreted with caution.

The inter-correlation matrix reveals several notable patterns. Feasibility showed modest positive associations with Traceability (r = .34) and MHC Preservation (r = .32), suggesting that experts who rated the framework as operationally feasible also tended to view it as auditable and controllable. The strongest association was a negative correlation between Traceability and Scalability (r = −.55): the experts who rated the framework most traceable were also the most skeptical of its scalability. Scalability otherwise showed variable patterns, consistent with its status as the criterion with the greatest expert disagreement.

Figure B13
Mean Validation Ratings by Expert Professional Background Category

Note. Bars represent mean ratings within each background category. Error bars represent ± 1 SE. Small sample sizes per category warrant cautious interpretation.

B.4.6 Qualitative Feedback Coding Summary

Expert qualitative feedback was collected through structured open-ended prompts during the tabletop exercise. Thematic coding identified 10 major themes organized by valence (positive, concern, neutral). Table B32 presents the complete qualitative feedback coding summary.
Table B32
Qualitative Feedback Themes from Expert Panel (N = 18)

Theme                            Frequency   Valence    Representative Quote/Paraphrase
Implementation Complexity        14          Concern    "Dynamic autonomy adjustment during active operations would require significant training investment..."
Doctrine Alignment               12          Positive   "The framework maps well to existing mission command philosophy, particularly the concept of disciplined..."
Accountability Clarity           15          Positive   "The explicit transfer-of-control protocol provides a clear chain of accountability that addresses..."
Trust Calibration Mechanism      11          Positive   "The continuous trust calibration loop is the most innovative aspect—it creates a mechanism to prevent..."
Scalability Concerns             13          Concern    "Scaling from a single platform to multi-domain operations with dozens of autonomous systems presents..."
Legal/Ethical Compliance         10          Positive   "The MHC preservation criteria provide a defensible framework for Article 36 weapons reviews and I..."
Operational Tempo Adaptability    9          Concern    "In high-tempo environments, the overhead of dynamic autonomy transitions may exceed available decision..."
Interoperability Requirements     8          Concern    "Integration with legacy C2 systems and allied force architectures would require standardized auto..."
Training Requirements            12          Neutral    "Operators would need extensive training to develop calibrated mental models for when and how to a..."
Adversarial Robustness            7          Concern    "The framework needs to address adversarial manipulation of autonomy triggers through cyber or electronic..."

Note. Frequency = number of experts whose feedback included this theme (out of 18). Valence categorized as Positive, Concern, or Neutral.

The qualitative feedback revealed a nuanced picture of expert assessments. The most frequently mentioned theme was Accountability Clarity (n = 15), with a positive valence, indicating that experts viewed the explicit transfer-of-control protocol as a significant contribution.
Implementation Complexity (n = 14) and Scalability Concerns (n = 13) were the most frequently cited concerns, consistent with the lower quantitative ratings on Scalability. The Trust Calibration Mechanism theme (n = 11) was identified as the most innovative aspect of the framework, with experts noting its potential to prevent both over-reliance and under-utilization of autonomous capabilities.

B.5 Cross-Phase Statistical Integration

This section integrates the quantitative findings across all research phases, providing convergence analysis, a research questions evidence matrix, and post-hoc power analysis. The purpose is to demonstrate the coherence and complementarity of the multi-phase design and to evaluate the strength of evidence supporting each research question.

B.5.1 Convergence Analysis

Table B33 presents the cross-phase convergence analysis, mapping key findings across all four research phases and assessing the degree of convergence.

Table B33
Cross-Phase Convergence Analysis: Key Findings Mapped Across Research Phases

Finding                            Phase 1                  Phase 2                     Phase 3                   Phase 4                    Rating
Accountability–Autonomy Tradeoff   Core qual. finding       HITL 97.8% vs. HOVL 68.2%   Trust: 5.42 → 3.77        Traceability = 5.83        Strong
Threat Tempo Impact                Time-critical emphasis   High sensitivity index      η²p = .671 (cog. load)    Tempo concern noted        Strong
HOTL Superiority                   Moderate doc. presence   Best composite success      Best balance across DVs   Doctrinal compat. = 5.50   Moderate
MHC Preservation                   19% doc. coding          Governance rules            Trust varies by arch.     MHC = 5.56                 Strong
Governance Framework Need          DoDD 3000.09 in 17.9%    Rules constrain behavior    ROE: 78–96% by cond.      Doctrinal = 5.50           Strong

Note. Convergence rating: Strong = consistent evidence across all phases; Moderate = consistent across most phases with minor variations; Partial = some supporting evidence with notable gaps; Divergent = contradictory findings across phases.
B.5.2 Research Questions Evidence Matrix

Table B34 maps each research question to the specific statistical evidence supporting or refuting its associated hypotheses and propositions.

Table B34
Research Questions Evidence Matrix: Statistical Support for Hypotheses and Propositions

RQ    Hypothesis                             Test                  Key Statistic              p        Effect Size   Conclusion
RQ1   H1b: Dynamic allocation > static       ABM comparison        HOTL best composite        —        —             Supported
RQ1   H1c: Dynamic > static (human perf.)    MANOVA                Pillai's = 0.681           < .001   η²p = .34     Supported
RQ2   H2b: Graduated checkpoints             ABM accountability    HITL: 97.8%, HOVL: 68.2%   < .001   d > 2.0       Supported
RQ2   H2c: Higher perceived agency           Trust ANOVA           F(2, 109) = 31.91          < .001   η²p = .37     Supported
RQ3   H3a: Sig. main effects of arch.        ANOVA (all DVs)       All Fs significant         < .05    .17–.73      Supported
RQ3   H3b: HOTL/HOVL faster                  Response time ANOVA   F(2, 109) = 147.21         < .001   η²p = .73     Supported
RQ3   H3c: Arch. × complexity interaction    Interaction F-tests   F(4, 109) = 5.31           < .001   η²p = .16     Partially Supported

Note. RQ = Research Question; H = Hypothesis. Effect sizes reported as partial eta-squared (η²p) or Cohen's d. Conclusion categories: Supported = clear statistical evidence; Partially Supported = mixed or qualified evidence; Not Supported = no significant evidence.

B.5.3 Statistical Power Analysis

Post-hoc power analyses were conducted for key tests to evaluate the adequacy of sample sizes and the reliability of non-significant findings. Table B35 presents the results.

Table B35
Post-Hoc Power Analysis for Key Statistical Tests

Test                                    Effect Size   N     α     Power
ANOVA: Autonomy → Response Time         η²p = .730    118   .05   > .999
ANOVA: Tempo → Cognitive Load           η²p = .671    118   .05   > .999
ANOVA: Autonomy → Trust                 η²p = .369    118   .05   > .999
ANOVA: Autonomy → Decision Accuracy     η²p = .171    118   .05   > .999
ANOVA: Autonomy × Tempo → Cog. Load     η²p = .163    118   .05   > .990
ANOVA: Autonomy × Tempo → Resp. Time    η²p = .085    118   .05   .870
ANOVA: Tempo → Trust (non-sig.)         η²p = .043    118   .05   .540
t-test: Traceability > 4.0              d = 2.965     18    .05   > .999
t-test: Scalability > 4.0               d = 0.613     18    .05   .720

Note. Power computed using post-hoc calculations based on observed effect sizes. α = Type I error rate. Power values > .80 are considered adequate (Cohen, 1988).

The power analysis confirms that the Phase 3 experimental design was adequately powered for all significant effects, with observed power exceeding .99 for all large effects. The non-significant effect of threat tempo on trust score had relatively low power (.54), suggesting that a larger sample might detect this effect if it exists. For Phase 4, the t-test for Scalability had marginal power (.72), reflecting the small expert panel (N = 18) and relatively modest effect size. Future validation studies should consider panels of N ≥ 25 to achieve adequate power for medium effects.

Overall, the power analysis supports the interpretability of the significant findings reported in Chapter 4. The few non-significant results (e.g., threat tempo on trust, several interaction effects) should be interpreted with appropriate caution given the power limitations, and future research with larger samples may reveal effects that the present design was underpowered to detect.
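Post-hoc power values of the kind reported in Table B35 can be approximated directly from the noncentral t and F distributions. The sketch below assumes the common G*Power-style noncentrality conventions (λ = d√n for a one-sample t-test; λ = f²·N with f² = η²p / (1 − η²p) for fixed-effects ANOVA); because software packages differ in df and noncentrality conventions, small discrepancies from the tabled values are expected.

```python
import numpy as np
from scipy import stats

def t_power(d, n, alpha=0.05):
    """Post-hoc power of a two-sided one-sample t-test,
    using the noncentral t distribution with nc = d * sqrt(n)."""
    df, nc = n - 1, d * np.sqrt(n)
    crit = stats.t.ppf(1 - alpha / 2, df)
    return (1 - stats.nct.cdf(crit, df, nc)) + stats.nct.cdf(-crit, df, nc)

def anova_power(eta_p2, n_total, df1, df2, alpha=0.05):
    """Post-hoc power of a fixed-effects ANOVA F-test,
    using the noncentral F distribution with nc = f^2 * N."""
    f2 = eta_p2 / (1 - eta_p2)   # Cohen's f^2 from partial eta-squared
    nc = f2 * n_total
    crit = stats.f.ppf(1 - alpha, df1, df2)
    return 1 - stats.ncf.cdf(crit, df1, df2, nc)

# Phase 4 one-sample t-tests against the midpoint (N = 18):
power_trace = t_power(2.965, 18)   # Traceability: effectively 1.0
power_scale = t_power(0.613, 18)   # Scalability: marginal power
```

Under these conventions t_power(2.965, 18) is effectively 1.0 and t_power(0.613, 18) comes out near .69, close to the tabled .720; differences of this size are typical across power software.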