Review the darkflow test code to understand how predictions are currently evaluated. Check whether an existing metric can be reused, such as `compute map` from TensorFlow or the TensorFlow Object Detection API. Implement mean Average Precision (mAP) as a new test option, likely in `test.py` or a similar script, and make sure it accepts the model's output format. Read the 4 comments on the issue for hints or prior attempts.
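
As a starting point for the mAP option, here is a minimal sketch in plain Python. It rests on several assumptions that should be checked against the actual code: each prediction is a dict with `label`, `confidence`, `topleft` and `bottomright` keys (the shape darkflow appears to use for its JSON output), the ground truth has been converted to the same dict shape, a detection counts as correct at IoU >= 0.5, and AP is the area under the all-point interpolated precision/recall curve (PASCAL VOC style, one of several conventions). The names `to_box`, `iou`, `average_precision` and `mean_average_precision` are hypothetical and do not exist in darkflow.

```python
from collections import defaultdict


def to_box(d):
    """Convert an assumed darkflow-style dict to an (x1, y1, x2, y2) tuple."""
    return (d["topleft"]["x"], d["topleft"]["y"],
            d["bottomright"]["x"], d["bottomright"]["y"])


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, iou_thresh=0.5):
    """AP for one class.

    preds: list of (image_id, confidence, box)
    gts:   dict image_id -> list of ground-truth boxes
    """
    n_gt = sum(len(boxes) for boxes in gts.values())
    if n_gt == 0:
        return 0.0
    # Each ground-truth box may be matched by at most one detection.
    matched = {img: [False] * len(boxes) for img, boxes in gts.items()}
    hits = []
    for img, _, box in sorted(preds, key=lambda p: -p[1]):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(gts.get(img, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        hit = (best_j >= 0 and best_iou >= iou_thresh
               and not matched[img][best_j])
        if hit:
            matched[img][best_j] = True
        hits.append(hit)

    # Precision and recall after each detection, in confidence order.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / n_gt)
    # Replace each precision by the maximum precision at any higher recall.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision/recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(predictions, ground_truth, iou_thresh=0.5):
    """Return (mAP, per-class AP).

    Both arguments map image_id -> list of assumed darkflow-style dicts.
    Only classes that occur in the ground truth are averaged.
    """
    preds_by_class = defaultdict(list)
    gts_by_class = defaultdict(lambda: defaultdict(list))
    for img, dets in predictions.items():
        for d in dets:
            preds_by_class[d["label"]].append(
                (img, d["confidence"], to_box(d)))
    for img, objs in ground_truth.items():
        for o in objs:
            gts_by_class[o["label"]][img].append(to_box(o))
    aps = {
        label: average_precision(preds_by_class.get(label, []), gts, iou_thresh)
        for label, gts in gts_by_class.items()
    }
    mean_ap = sum(aps.values()) / len(aps) if aps else 0.0
    return mean_ap, aps
```

A test option could collect the per-image predictions, load the annotations into the same shape, and print the value returned by `mean_average_precision`. Classes that appear only in the predictions are ignored here, which is a design choice; whether that matches the evaluation the issue asks for needs to be confirmed.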