AI Tournament


Our very first AI Tournament just ended, and it was amazing!
Participants trained an AI algorithm to effectively play Dead Or Alive ++.
The three best algorithms participated in the final event and competed for the 1400 CHF prize!


Updated every Tuesday
Watch the live event on Twitch

Final Standings

Rank Name Score
1 Kenshiro 12.73
2 rssalessio 7.70
3 Trrrrr 4.43

Tournament Standings

Rank Name Score
1 Kenshiro 12.65
2 Trrrrr 4.23
3 rssalessio 2.37
4 helga
5 Pierre
6 Anu
7 CharyMachine
8 Chicago Junkman
9 drlux
10 Ema42
11 Furio19
12 galacticor
13 HashArt
14 JeyDi92
15 King Arthur
16 muraatozbek
17 Noob
18 ralami
19 Riju06
20 robocop
21 wizardOfRobots
22 sash
23 Shivz
24 Sumitnc12594
25 TeamWindNet
26 The Trainer
27 Valetudo
28 VlaDiPooH
29 Water007
30 wegfawefgawefg
32 Maestrodigatto
33 mohithtaker
34 cblee
35 nviada
36 Djordje
37 My Joy
38 Kitadake
39 Rao
40 Amathlog
41 Sakthi
42 KingCob
43 geronimo
44 ankit7921
45 Quinton Starck
46 Alwahsh
47 Derezzed
48 Adxenix
49 syveqc
50 pastellic
51 ganman
52 junhill


A collection of the best episodes in which we evaluated AI algorithms submitted by participants.

Episode #43 | Dead Or Alive ++ AI Tournament - Evaluating Challengers' Algorithms #4

Episode #45 | Dead Or Alive ++ AI Tournament - Evaluating Challengers' Algorithms #5

Episode #51 | Dead Or Alive ++ AI Tournament - Evaluating Challengers' Algorithms #6

Episode #53 | Dead Or Alive ++ AI Tournament - Evaluating Challengers' Algorithms #7


Train an AI agent to effectively play Dead Or Alive ++.

It will face the standard COM player and try to complete the game: seven stages plus the final boss.

The environment, with Tournament-specific settings, can be found in the dedicated section of the DIAMBRA repository here. It also includes a complete submission example featuring a baseline random agent.

The repository also contains Python scripts and notebooks showing how to interact with the environment, together with a working implementation of a Reinforcement Learning agent to use as a starting baseline.

The public leaderboard will be updated once a week, with submissions evaluated live and streamed on our Twitch channel every Tuesday at 10 PM CET.

After the competition ends, the best agents will take part in the Final Event, where they will be evaluated live to determine the final standings. The event will be broadcast at the Reinforcement Learning Zurich meetup.

Getting started

Three easy steps:

1) Get the environment from GitHub
Inside you will find Python notebooks with everything you need, from environment usage to a working baseline PPO RL agent

2) Join the discord server
You'll meet other participants and the whole community, with a dedicated channel for support

3) Refer to "Getting started" video tutorials here
They provide additional info on coding aspects and implementation details

and you are ready to fight.


Start Date: April 20th, 2021
Competition Presentation Event: April 27th, 2021 - RLZ Meetup
Weekly Submission Deadline: Every Sunday 11:59 PM UTC
Weekly Submissions Evaluation: Twitch Live every Tuesday 10 PM CET
Final Submission Deadline and End Date: July 5th, 2021 11:59 PM UTC
Final Live Event: July 7th, 2021 - RLZ Meetup


Kindly offered by Reinforcement Learning Zurich!

1st Place 1000 CHF
2nd Place 300 CHF
3rd Place 100 CHF


An example of a valid submission can be found in the DIAMBRA Environment repository here. It consists of:

1) A Python file containing a Python class with the proper constructor, a reset(self) method called when env.reset() is called, and an act(self, observation, info) method returning the actions to be executed

2) A file containing trained agent model parameters (e.g. policy network weights, if any)

3) A requirements.txt file listing the Python prerequisites needed to create the virtual environment (either virtualenv or conda) with all the modules required to run the agent (if any)

4) A packages.txt file listing the OS package dependencies required to run the agent (if any)
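
The agent class from item 1 could look roughly like the sketch below. The class name RandomAgent, the constructor arguments n_actions and seed, and the integer action format are assumptions made for illustration; the example in the repository may be organized differently.

```python
import random


class RandomAgent:
    """Hypothetical baseline agent that ignores observations and acts at random."""

    def __init__(self, n_actions=12, seed=None):
        # n_actions is an assumed size for a Discrete action space.
        self.n_actions = n_actions
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        # Called whenever env.reset() is called: clear per-episode state.
        self.steps = 0

    def act(self, observation, info):
        # Return the action to execute; a trained agent would query its policy here.
        self.steps += 1
        return self.rng.randrange(self.n_actions)
```

A trained agent would load its model parameters (item 2) in the constructor and use them inside act.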


The evaluation metric is the total cumulative reward collected by the agent in an episode. Thus, every contender can estimate their own performance simply by running the agent locally.

The environment's default reward function will be used to evaluate submissions: it is directly proportional to the variation in the characters' health (positive when the agent hits the opponent and negative when the agent gets hit).
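
As a rough illustration of a health-based reward, the sketch below assumes the per-step reward is the opponent's health loss minus the agent's own health loss, multiplied by a constant. The function name, its arguments, and the scale constant are assumptions, not the environment's actual implementation.

```python
def health_reward(own_prev, own_now, opp_prev, opp_now, scale=1.0):
    """Assumed reward: damage dealt minus damage received, times a constant."""
    damage_dealt = opp_prev - opp_now   # positive when the agent hits the opponent
    damage_taken = own_prev - own_now   # positive when the agent gets hit
    return scale * (damage_dealt - damage_taken)
```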

Agent performance, averaged over five runs, will be used to define the leaderboard standings.
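
A local estimate of this metric can be sketched as follows. The Gym-style interface (reset returning an observation, step returning observation, reward, done, info) is an assumption, and ToyEnv and ConstantAgent are hypothetical stand-ins so that the example runs without the real environment.

```python
class ToyEnv:
    """Stand-in environment (assumed Gym-style interface), three steps per episode."""

    def reset(self):
        self.t = 0
        return 0  # dummy observation

    def step(self, action):
        self.t += 1
        reward = 1.0  # dummy reward per step
        done = self.t >= 3
        return 0, reward, done, {}


class ConstantAgent:
    """Trivial agent exposing the reset/act interface."""

    def reset(self):
        pass

    def act(self, observation, info):
        return 0


def run_episode(env, agent):
    """Return the total cumulative reward collected in one episode."""
    observation, info = env.reset(), {}
    agent.reset()
    total, done = 0.0, False
    while not done:
        action = agent.act(observation, info)
        observation, reward, done, info = env.step(action)
        total += reward
    return total


def evaluate(env, agent, n_runs=5):
    """Average the cumulative reward over n_runs episodes."""
    return sum(run_episode(env, agent) for _ in range(n_runs)) / n_runs
```

Replacing ToyEnv with the tournament environment and ConstantAgent with the submitted agent should give a score comparable to the leaderboard one, up to run-to-run variance.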

Technical Details

Observation space
A stack of the last four game frames (pixels) plus a fifth additional channel containing the following complementary info (numerical data): last 12 actions (one-hot encoding), own and opponent health, own and opponent side (Left/Right), stage number and selected character (one-hot encoding).
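
To make the one-hot encoding of the last 12 actions concrete, here is a minimal sketch. The helper names, the padding with action 0 (assumed to be the no-op), and the flat list layout are assumptions; the actual arrangement of the fifth channel is defined by the environment.

```python
def one_hot(index, size):
    """Return a list of length `size` with a 1 at `index` and 0 elsewhere."""
    if not 0 <= index < size:
        raise ValueError("index out of range")
    return [1 if i == index else 0 for i in range(size)]


def encode_last_actions(actions, n_actions, history=12):
    """Concatenate one-hot encodings of the most recent `history` actions."""
    recent = list(actions)[-history:]
    # Pad with action 0 (assumed to be the no-op) when the history is short.
    recent = [0] * (history - len(recent)) + recent
    encoded = []
    for a in recent:
        encoded.extend(one_hot(a, n_actions))
    return encoded
```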

Action spaces
Four action spaces are available, depending on the choice of Discrete vs. Multi Discrete and with vs. without attack button combinations, resulting in 12, 16, 36, or 72 possible actions.
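
The four sizes are consistent with the following breakdown, which is inferred from the numbers rather than stated here: 9 movement options (no-move plus 8 directions), and 4 attack options without button combinations or 8 with them (no-attack included). Under that assumption, a Discrete space lists moves and attacks together with a single shared no-op, while a Multi Discrete space exposes every move/attack pair.

```python
N_MOVES = 9  # assumed: no-move plus 8 directions


def action_space_size(multi_discrete, attack_combinations):
    """Number of possible actions under the assumed move/attack breakdown."""
    n_attacks = 8 if attack_combinations else 4  # assumed, no-attack included
    if multi_discrete:
        return N_MOVES * n_attacks      # every (move, attack) pair
    return N_MOVES + n_attacks - 1      # one list, the no-op counted once


sizes = sorted(
    action_space_size(md, combo)
    for md in (False, True)
    for combo in (False, True)
)
```

Here sizes evaluates to [12, 16, 36, 72], matching the four counts listed above.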


Competition related
- Only one account per participant allowed
- Participants are not allowed to share code
- If participants want to team up, they are allowed to submit their agent from a single user profile belonging to one member of the team
- Maximum two members per team
- Only one submission per week allowed (the latest one will be considered)
- Only one final submission allowed (the latest one will be considered)

Game related
- Every available character can be used
- The agent can be character-specific

Software related
- The action space can be chosen freely by participants among those described above
- Observations can be modified by participants inside the agent act(self, observation) method before being fed to the agent policy, but the agent has to maintain an interface with the observation format described above
- The submitted agent must be compatible with Linux Ubuntu 18.04 OS
- Maximum GPU RAM usage for inference: 1.25 GB
- Maximum inference time (considering also observations modifications, if any): 50 ms on Nvidia 1050

Failing to comply with these rules will result in the agent's performance being discarded (for example, if the RAM usage limit is violated) and/or the participant being disqualified (if competition rules are violated).
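
Before submitting, the inference time limit can be checked roughly with a sketch like the one below. The helper name and the number of calls are assumptions, and a measurement on local hardware is only indicative, since the 50 ms limit refers to an Nvidia 1050.

```python
import time


def max_inference_time_ms(act, observation, info, n_calls=100):
    """Return the slowest observed call to act(observation, info), in milliseconds."""
    worst = 0.0
    for _ in range(n_calls):
        start = time.perf_counter()
        act(observation, info)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        worst = max(worst, elapsed_ms)
    return worst
```

Passing the agent's bound act method together with a representative observation gives the worst case over n_calls calls, which can then be compared against the 50 ms budget.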

Competition-Specific Terms
Competition Name (the "Competition"): AI Tournament
Competition Organizer: DIAMBRA | Dueling AI Arena
Competition Sponsor: Reinforcement Learning Zurich (Nonprofit)
Competition Website:

Prizes: 1400 CHF
- First Prize: 1000 CHF
- Second Prize: 300 CHF
- Third Prize: 100 CHF

WINNER LICENSE TYPE: Non-Exclusive License

Competitions are open worldwide, except that if you are a resident of Crimea, Cuba, Iran, Syria, North Korea, Sudan, or are subject to Italian/EU export controls or sanctions, you may not enter the Competition. Other local rules and regulations may apply to you, so please check your local laws to ensure that you are eligible to participate in skills-based competitions. The Competition Sponsor reserves the right to award alternative prizes where needed to comply with local laws.

As specified in the Terms of Use (Section 8) and the General Competition Rules (Section 7D), Environments provided by the Competition Organizer and/or the Competition Sponsor are a mere software interface to existing videogames, and they cannot work as a standalone application. As such, they require the User, or the Competition Participant, to own software elements protected by copyright, and to interface them with the Environment itself. This is the case, for example, for the Game ROMs required to run the corresponding Game-related Environment. In such cases, it is the sole responsibility of the User, or Competition Participant, to comply with all laws and regulations, and to make sure they have the right to use such copyright-protected material. The Competition Organizer and the Competition Sponsor will make every effort to avoid illegal distribution of such material, and are by no means responsible for copyright infringement.

Entry in this competition constitutes your acceptance of these official competition rules.

The Competition named above is a skills-based competition to promote and further the field of machine learning. You must register via the Competition Website to enter. Your competition submissions ("Submissions") must conform to the requirements stated on the Competition Website. Your Submissions will be scored based on the evaluation metric described on the Competition Website. Subject to compliance with the Competition Rules, Prizes described on the Competition Website, if any, will be awarded to participants with the best scores, based on the merits of the machine learning models submitted. See also the General Competition Rules.