
Final project (Bachelor's)

Reykjavík University > School of Technology > BSc Department of Computer Science

Please use this identifier when citing or linking to this work: https://hdl.handle.net/1946/44332

Title: 
  • Expediting self-play learning in AlphaZero-style game-playing agents
Degree level: 
  • Bachelor's
Supervisor: 
Abstract: 
  • One of the main appeals of AlphaZero-style game-playing agents, which combine deep learning with Monte Carlo Tree Search, is that they can be trained autonomously without external expert-level domain knowledge. However, training such agents is generally computationally expensive, and the most time-consuming step is generating training data via self-play. Here we propose an improved strategy for generating self-play training data that yields higher-quality samples, especially in the earlier training phases. The new strategy initially emphasizes the later game phases and gradually extends them to entire games as training progresses. In our test domains, the games Connect4 and Breakthrough, we show that game-playing agents using the improved training approach learn significantly faster than counterpart agents using a standard approach. Furthermore, we show empirically that the proposed strategy is, in our test domains, superior to several recently proposed strategies for expediting self-play learning in game playing.
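    The idea of emphasizing late game phases first and then widening to whole games can be illustrated with a small sketch. This is not the thesis's actual method, which the abstract does not specify; it is one hypothetical reading in which only the final portion of each self-play game is kept as training data, and that portion grows linearly with the training iteration. The names `late_phase_fraction`, `select_training_positions`, `warmup_iterations`, and `start_fraction` are illustrative assumptions.

```python
def late_phase_fraction(iteration, warmup_iterations, start_fraction=0.2):
    """Fraction of each game, counted from the end, kept as training data.

    Hypothetical linear schedule (an assumption, not taken from the thesis):
    starts at start_fraction and reaches 1.0 (entire games) once
    iteration >= warmup_iterations.
    """
    if warmup_iterations <= 0:
        return 1.0
    progress = min(1.0, max(0.0, iteration / warmup_iterations))
    return min(1.0, start_fraction + (1.0 - start_fraction) * progress)


def select_training_positions(game_positions, iteration, warmup_iterations):
    """Keep only the final portion of one self-play game's positions."""
    fraction = late_phase_fraction(iteration, warmup_iterations)
    # Always keep at least one position, never more than the game contains.
    keep = min(len(game_positions), max(1, round(fraction * len(game_positions))))
    return game_positions[-keep:]


if __name__ == "__main__":
    positions = list(range(10))  # stand-in for the positions of one game
    for it in (0, 5, 10):
        print(it, select_training_positions(positions, it, warmup_iterations=10))
```

    Under this assumed schedule, early iterations train only on endgame positions, and the window widens until whole games are used.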

Accepted: 
  • 17.5.2023
URI: 
  • http://hdl.handle.net/1946/44332


Files
Filename  Size  Access  Description  File type
Expediting_Self_Play_Learning_in_AlphaZero_Style_Game_Playing_Agents.pdf  969.3 kB  Open  Full text  PDF