BSI-PT

Welcome to the BSI-PT project! This project implements the BSI-PT (Bayesian Strategy Inference plus Policy Tracking) framework and the experiments presented in the paper Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking.

In the experiments, we compare our BSI-PT framework with previous BPR variants, including BPR+ [1], DPN-BPR+ [2], and Bayes-OKR [3], in the extended batter vs. pitcher game (EBvPG) defined in the paper.

Authors

  • Kuei-Tso Lee
  • Yen-Yun Huang
  • Je-Ruei Yang
  • Sheng-Jyh Wang

Citation

Please cite our paper if you find this project useful.

@article{lee2023opponent,
  title={Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking},
  author={Lee, Kuei-Tso and Huang, Yen-Yun and Yang, Je-Ruei and Wang, Sheng-Jyh},
  journal={IEEE Transactions on Games},
  year={2023},
  publisher={IEEE}
}

Getting Started

To reproduce the results in the paper, use the scripts/run_exps_and_plot.sh script. There are two methods for setting up the environment to run the script.

Virtual Environment

Make sure you have Python 3.8+ installed, then set up a virtual environment using venv and activate it.

python -m venv venv
source venv/bin/activate
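
If you are on Windows, activate the virtual environment via the Scripts directory instead (assuming the default venv layout):

venv\Scripts\activate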

Then install the dependencies in requirements.txt using pip.

pip install -r requirements.txt

Docker

First, build the image. Make sure you are in the root directory of this repository.

docker build -t bsi-pt .

Run a container from the image, mounting the src/ directory so that the results are accessible from the host.

docker run -it -v $(pwd)/src:/app/src bsi-pt

You will enter an interactive shell inside the container.

Run the Experiments

Run the experiments using the following command:

scripts/run_exps_and_plot.sh [SCENARIO] [NUM_RUNS] [NUM_EPISODES]

[SCENARIO] defines the scenario the experiments run in; the default is baseball (see extended batter vs. pitcher game (EBvPG)). [NUM_RUNS] defines the number of games to run, and [NUM_EPISODES] defines the number of episodes in each game.

In the paper, the value of [NUM_RUNS] is quite large (200,000 runs) to ensure the results are statistically significant. In this project, the default is 1,000, which still takes a few minutes to complete. For a quick run, set it to a smaller number (e.g. 10), but note that the results may be quite unstable with so few runs.

# for a quick run, try this
scripts/run_exps_and_plot.sh baseball 10 20

The raw data will be stored in src/data/ in .pkl format. The figures will be generated from those data and stored in src/fig/, then the data points in the figures will be stored in CSV format in src/csv/.
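
If you want to inspect the raw data directly, the .pkl files can be loaded with Python's standard pickle module. The sketch below is only a starting point; the exact file names and the structure of each loaded object depend on the scenario and parameters you ran.

import pickle
from pathlib import Path

# List every pickle file produced by the experiments (names vary by scenario and parameters)
for pkl_path in sorted(Path("src/data").glob("*.pkl")):
    with pkl_path.open("rb") as f:
        data = pickle.load(f)
    # Print the file name and the type of the loaded object to get a feel for its structure
    print(pkl_path.name, type(data))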

If you want to run the experiments individually or inspect each algorithm in detail, see Run Individual Experiment.

Extras

In addition to the EBvPG environment, which lives under the baseball_game/ directory of this repository, we also implemented three other environments inspired by the Bayes-OKR [3] paper: Grid World, Navigation Game, and Soccer Game. In these three environments, Bayes-ToMoP [4] and BSI (BSI-PT without the policy tracking capability) agents are also provided.


  1. Hernandez-Leal, Pablo, et al. "Identifying and tracking switching, non-stationary opponents: A Bayesian approach." (2016). 

  2. Zheng, Yan, et al. "Efficient policy detecting and reusing for non-stationarity in Markov games." (2021). 

  3. Chen, Hao, et al. "Accurate policy detection and efficient knowledge reuse against multi-strategic opponents." (2022). 

  4. Yang, Tianpei, et al. "Towards efficient detection and optimal response against sophisticated opponents." (2018).