This repository provides an end-to-end reinforcement learning framework for StarCraft II gameplay research. The core agent uses Proximal Policy Optimization (PPO) to train policy networks that map observations from the PySC2 environment to in-game actions. Network layers, reward shaping, and training schedules are all configurable. The system supports multiprocessing for parallel sample collection, logging utilities for monitoring training curves, and evaluation scripts for running trained policies against scripted or built-in AI opponents. The codebase is written in Python and uses TensorFlow for model definition and optimization. Users can extend components such as reward functions, state preprocessing, and network architectures to suit specific research objectives.
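
For readers new to PPO, the clipped surrogate objective at the heart of the algorithm can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's TensorFlow implementation: the function name `ppo_clipped_loss`, its argument names, and the default `clip_eps=0.2` are assumptions chosen for the example.

```python
import math


def ppo_clipped_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Return the negated mean PPO clipped surrogate objective.

    All names and the default clip_eps are hypothetical, for illustration only.
    Each argument is a sequence of floats with one entry per sampled action.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if not advantages:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the policy
        # that collected the sample.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)

    # Negate so that minimizing the loss maximizes the surrogate objective.
    return -total / len(advantages)


if __name__ == "__main__":
    # Toy batch of three sampled actions.
    loss = ppo_clipped_loss(
        new_log_probs=[-0.9, -1.2, -0.3],
        old_log_probs=[-1.0, -1.0, -0.5],
        advantages=[0.5, -0.2, 1.0],
    )
    print(loss)
```

Clipping removes the incentive to move the policy far from the one that gathered the data in a single update, which is why PPO can reuse each batch of samples for several gradient steps. A full training loss typically also includes a value-function term and an entropy bonus, which this sketch omits.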