BSI-PT
Welcome to the BSI-PT project! This project implements the BSI-PT (Bayesian Strategy Inference plus Policy Tracking) framework and the experiments presented in the paper Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking.
In the experiments, we compare our BSI-PT framework with previous BPR variants, including BPR+ [1], DPN-BPR+ [2], and Bayes-OKR [3], in the extended batter vs. pitcher game (EBvPG) defined in the paper.
Authors
- Kuei-Tso Lee
- Yen-Yun Huang
- Je-Ruei Yang
- Sheng-Jyh Wang
Citation
Please cite our paper if you find this project useful.
@article{lee2023opponent,
title={Opponent Exploitation Based on Bayesian Strategy Inference and Policy Tracking},
author={Lee, Kuei-Tso and Huang, Yen-Yun and Yang, Je-Ruei and Wang, Sheng-Jyh},
journal={IEEE Transactions on Games},
year={2023},
publisher={IEEE}
}
Getting Started
To reproduce the results in the paper, use the scripts/run_exps_and_plot.sh script. There are two methods for setting up the environment to run the script.
Virtual Environment
Make sure you have Python 3.8+ installed, then set up a virtual environment using venv and activate it.
python -m venv venv
source venv/bin/activate
Then install the dependencies in requirements.txt using pip.
pip install -r requirements.txt
Docker
First, build the image. Make sure you are in the root directory of this repository.
docker build -t bsi-pt .
Run a container using the image, mounting the src/ directory so that the results written inside the container are visible on the host.
docker run -it -v "$(pwd)/src:/app/src" bsi-pt
You will enter an interactive shell inside the container.
Run the Experiments
Run the experiments using the following command:
scripts/run_exps_and_plot.sh [SCENARIO] [NUM_RUNS] [NUM_EPISODES]
[SCENARIO] defines the scenario the experiments run on; the default is baseball (see extended batter vs. pitcher game (EBvPG)). [NUM_RUNS] defines the number of games to run, and [NUM_EPISODES] defines the length of each game.
The value of [NUM_RUNS] is quite large in the paper (200,000 runs) since we want to ensure the results are statistically significant. We set the default of [NUM_RUNS] to 1,000 in the project, which still requires a few minutes to complete. For a quick run, it's recommended to set it to a smaller number (e.g., 10). However, please note that the results may be quite unstable with a small number of runs.
# for a quick run, try this
scripts/run_exps_and_plot.sh baseball 10 20
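To give a rough sense of why a small number of runs is unstable, the sketch below computes the standard error of a sample mean, which shrinks as 1/sqrt(N) when the runs are independent. This is a generic statistical illustration, not code from this project: the function name standard_error and the placeholder standard deviation of 1.0 are assumptions made for the example.

```python
import math


def standard_error(std_dev: float, num_runs: int) -> float:
    """Standard error of a sample mean over num_runs independent runs."""
    if num_runs <= 0:
        raise ValueError("num_runs must be positive")
    return std_dev / math.sqrt(num_runs)


if __name__ == "__main__":
    # std_dev = 1.0 is a placeholder, not a value measured in the experiments.
    for num_runs in (10, 1_000, 200_000):
        se = standard_error(1.0, num_runs)
        print(f"{num_runs:>7} runs -> standard error {se:.4f}")
```

Under these assumptions, going from 10 runs to 200,000 runs reduces the standard error by a factor of sqrt(20,000), roughly 141.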
The raw data will be stored in src/data/ in .pkl format. The figures will be generated from those data and stored in src/fig/, and the data points in the figures will be stored in CSV format in src/csv/.
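For a first look at the raw data, a minimal sketch such as the following could be used to load every .pkl file in a directory with Python's standard pickle module. The helper name load_pickles is hypothetical, and the sketch assumes the files sit directly in src/data/ and that any project classes they reference are importable (for example, by running it from the directory the experiments were run from). The structure of the stored objects is not described here, so the example only prints their types.

```python
import pickle
from pathlib import Path


def load_pickles(data_dir):
    """Return {file name: unpickled object} for every .pkl file in data_dir."""
    results = {}
    for path in sorted(Path(data_dir).glob("*.pkl")):
        with path.open("rb") as f:
            # Only unpickle files you trust: pickle can execute arbitrary code.
            results[path.name] = pickle.load(f)
    return results


if __name__ == "__main__":
    # "src/data" is the output directory named above; adjust if yours differs.
    for name, obj in load_pickles("src/data").items():
        print(name, type(obj).__name__)
```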
If you want to run the experiments individually or inspect each algorithm in detail, see Run Individual Experiment.
Extras
In addition to the EBvPG environment, which is under the baseball_game/ directory in this repository, we also implemented three other environments inspired by the Bayes-OKR [3] paper: Grid World, Navigation Game, and Soccer Game. In these three environments, Bayes-ToMoP [4] and BSI (BSI-PT without policy tracking capability) agents are also provided.
[1] Hernandez-Leal, Pablo, et al. "Identifying and tracking switching, non-stationary opponents: A Bayesian approach." (2016).
[2] Zheng, Yan, et al. "Efficient policy detecting and reusing for non-stationarity in Markov games." (2021).
[3] Chen, Hao, et al. "Accurate policy detection and efficient knowledge reuse against multi-strategic opponents." (2022).
[4] Yang, Tianpei, et al. "Towards efficient detection and optimal response against sophisticated opponents." (2018).