- Pre-built algorithms: Q-learning, Monte Carlo, value iteration, policy iteration
- Multiple sample environments: GridWorld, MountainCar, Multi-Armed Bandits
- Uniform agent-environment interface defined by shared base classes
- Utility functions for logging, performance tracking, and visualization
- Modular and extensible design for custom agents/environments
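
To make the base-class idea above concrete, here is a minimal sketch of what a uniform agent-environment interface could look like. Every name in it (`Environment`, `Agent`, `CorridorEnv`, `QLearningAgent`, `run_episode`) and the `(next_state, reward, done)` return shape of `step` are assumptions made for illustration; the library's actual classes and signatures may differ.

```python
import random
from abc import ABC, abstractmethod
from collections import defaultdict


class Environment(ABC):
    """Hypothetical environment base class (not the library's real API)."""

    @abstractmethod
    def reset(self):
        """Start a new episode and return the initial state."""

    @abstractmethod
    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""


class Agent(ABC):
    """Hypothetical agent base class (not the library's real API)."""

    @abstractmethod
    def act(self, state):
        """Choose an action for the given state."""

    @abstractmethod
    def learn(self, state, action, reward, next_state, done):
        """Update the agent from one observed transition."""


class CorridorEnv(Environment):
    """Toy 1-D corridor: start in cell 0, finish on reaching the last cell."""

    def __init__(self, length=5):
        self.length = length
        self.pos = 0

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves right, any other action moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.pos = min(max(self.pos + move, 0), self.length - 1)
        done = self.pos == self.length - 1
        reward = 1.0 if done else -0.01  # small step cost, bonus at the goal
        return self.pos, reward, done


class QLearningAgent(Agent):
    """Tabular Q-learning with an epsilon-greedy policy."""

    def __init__(self, n_actions, alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)  # explore
        values = self.q[state]
        best = max(values)
        # Exploit, breaking ties between equally good actions at random.
        return self.rng.choice([a for a, v in enumerate(values) if v == best])

    def learn(self, state, action, reward, next_state, done):
        # Q-learning target: bootstrap from the best next action unless terminal.
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def run_episode(env, agent, max_steps=200):
    """Run one episode using only the base-class methods; return the total reward."""
    state = env.reset()
    total = 0.0
    for _ in range(max_steps):
        action = agent.act(state)
        next_state, reward, done = env.step(action)
        agent.learn(state, action, reward, next_state, done)
        state = next_state
        total += reward
        if done:
            break
    return total


env = CorridorEnv()
agent = QLearningAgent(n_actions=2)
returns = [run_episode(env, agent) for _ in range(200)]
```

Because `run_episode` depends only on the two base classes, a custom environment or agent can be dropped in by subclassing and implementing the abstract methods, which is the kind of extensibility the last bullet describes.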