# Dynamic Target Surveillance under Ballistic Threat

The Army has the challenge of monitoring a collection of targets while under line-of-sight ballistic threat. Uncertainties in both the environment and the behavior of threat sources make it difficult to develop optimal surveillance strategies for such missions. The goal of this research is to develop a planning framework for target surveillance that accounts for these uncertainties.

This work formulates the target surveillance problem as a Partially Observable Markov Decision Process (POMDP). A POMDP is composed of a state transition function that captures the uncertainty in the dynamics of the environment, a reward function that can be used to balance safety and operational objectives, and an observation function that represents the distribution over observations given the current state of the environment. The solution to a POMDP is a dynamic strategy that reacts to new observations to maximize the expected cumulative reward over time (Kochenderfer, 2015). The challenge with the target surveillance problem is that it is very high dimensional, and naïve approaches to solving the POMDP scale poorly.

This research attempts to overcome the high dimensionality of the problem through two approaches. The first approach utilizes parallel compute resources to determine approximate POMDP solutions. This approach focuses on both distributed and shared memory resources as well as heterogeneous CPU/GPU computing platforms. Online solution methods such as Monte-Carlo Tree Search (Browne, 2012) are parallelizable over large clusters with distributed memory. Offline solution methods based on dynamic programming such as value iteration can be parallelized on GPUs. By combining online and offline solution methods on parallel compute resources, this work aims to solve the target surveillance POMDP in near-real time.

The second approach explores generalization of POMDP solutions with smaller state spaces to larger unexplored domains. This approach explores feature similarities between urban environments where target surveillance missions might take place. Plans developed for some set of urban environments can be adapted to new but similar environments with minimal or no re-planning. Such generalization can both reduce the planning time and allow solutions for larger domains that are intractable for offline methods.

Current work at the Army Research Lab (ARL) is focused on exploring high performance computing architectures for efficiently computing ballistic threat probability fields (Richie et al., 2013). This research incorporates the tools developed at the ARL into the target surveillance planning framework to efficiently evaluate the POMDP reward function. Performance metrics of strategies obtained with the framework are compared to heuristic strategies in order to quantify the benefits of the approach.

The objective of this work is to frame the problem of optimal dynamic target surveillance under ballistic threat as a sequential decision problem that can be solved computationally. Combined with quad-tree ray tracing tools developed by the ARL, the algorithms developed as part of this research will enable the creation of real-time decision support tools for use by peacekeeping forces.

References:

M. Kochenderfer, *Decision Making Under Uncertainty: Theory and Application*. MIT Press, 2015.

C.B. Browne, E. Powley, D. Whitehouse, S.M. Lucas, P.I. Cowling, P. Rohlfshagen, S. Tavener, D. Perez, S. Samothrakis, and S. Colton, “A Survey of Monte Carlo Tree Search Methods,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 4, no. 1, pp. 1–43, 2012.

D. A. Richie, J. A. Ross, S. J. Park, and D. R. Shires, “Ray-tracing-based geospatial optimization for heterogeneous architectures enhancing situational awareness" 2013 IEEE 16th International Conference on Computational Science and Engineering, pp. 81-86