Mastering the game of Go without human knowledge
Nature. 2017 Oct 18;550(7676):354-359.
doi: 10.1038/nature24270.
David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis
- PMID: 29052630
David Silver et al. Nature. 2017.
Abstract
A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo's own move selections and also the winner of AlphaGo's games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100-0 against the previously published, champion-defeating AlphaGo.
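The training signal described in the abstract (predict the search's own move selections and the eventual winner) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the three-move toy position, and the choice of squared error plus cross-entropy as the two loss terms are assumptions made for illustration, and any regularization term is omitted.

```python
import math

def softmax(logits):
    """Convert raw move scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def self_play_loss(policy_logits, value, search_policy, outcome):
    """Loss for one position (hypothetical helper, not from the paper's code).

    policy_logits: network scores for each legal move
    value:         network's predicted game result, in [-1, 1]
    search_policy: move probabilities chosen by the tree search (the target)
    outcome:       actual game result from this player's view, +1 or -1
    """
    p = softmax(policy_logits)
    # Value term: squared error between predicted and actual winner.
    value_loss = (outcome - value) ** 2
    # Policy term: cross-entropy between search target and network output.
    policy_loss = -sum(
        pi * math.log(q) for pi, q in zip(search_policy, p) if pi > 0
    )
    return value_loss + policy_loss

# Toy position with three legal moves (made-up numbers).
search_policy = [0.7, 0.2, 0.1]

# An untrained network: uniform move scores, no opinion on the winner.
loss_uniform = self_play_loss([0.0, 0.0, 0.0], 0.0, search_policy, 1.0)

# A network that already matches the search and predicts the winner.
matched_logits = [math.log(x) for x in search_policy]
loss_matched = self_play_loss(matched_logits, 1.0, search_policy, 1.0)

print(loss_uniform > loss_matched)
```

In this sketch the loss falls as the network's outputs move toward the search's move probabilities and the game result, which is the sense in which the search acts as the network's teacher.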
Comment in
- Artificial intelligence: Learning to play Go from scratch. Singh S, Okun A, Jackson A. Nature. 2017 Oct 18;550(7676):336-337. doi: 10.1038/550336a. PMID: 29052631. No abstract available.
Similar articles
- Mastering the game of Go with deep neural networks and tree search. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, van den Driessche G, Schrittwieser J, Antonoglou I, Panneershelvam V, Lanctot M, Dieleman S, Grewe D, Nham J, Kalchbrenner N, Sutskever I, Lillicrap T, Leach M, Kavukcuoglu K, Graepel T, Hassabis D. Nature. 2016 Jan 28;529(7587):484-9. doi: 10.1038/nature16961. PMID: 26819042.
- A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Silver D, Hubert T, Schrittwieser J, Antonoglou I, Lai M, Guez A, Lanctot M, Sifre L, Kumaran D, Graepel T, Lillicrap T, Simonyan K, Hassabis D. Science. 2018 Dec 7;362(6419):1140-1144. doi: 10.1126/science.aar6404. PMID: 30523106.
- Google AI algorithm masters ancient game of Go. Gibney E. Nature. 2016 Jan 28;529(7587):445-6. doi: 10.1038/529445a. PMID: 26819021. No abstract available.
- Yoshida H. Brain Nerve. 2019 Jul;71(7):681-694. doi: 10.11477/mf.1416201340. PMID: 31289242. Review. Japanese.
- Looking to the future: Learning from experience, averting catastrophe. Carpenter GA. Neural Netw. 2019 Dec;120:5-8. doi: 10.1016/j.neunet.2019.09.018. Epub 2019 Oct 10. PMID: 31607596. Review.
Cited by
- Bülow RD, Dimitrov D, Boor P, Saez-Rodriguez J. Semin Immunopathol. 2021 Oct;43(5):739-752. doi: 10.1007/s00281-021-00847-y. Epub 2021 Apr 9. PMID: 33835214. Free PMC article. Review.
- Levy J, Mussack D, Brunner M, Keller U, Cardoso-Leite P, Fischbach A. Front Psychol. 2020 Aug 21;11:2190. doi: 10.3389/fpsyg.2020.02190. eCollection 2020. PMID: 32973639. Free PMC article.
- A deep learning approach to identify unhealthy advertisements in street view images. Palmer G, Green M, Boyland E, Vasconcelos YSR, Savani R, Singleton A. Sci Rep. 2021 Mar 1;11(1):4884. doi: 10.1038/s41598-021-84572-4. PMID: 33649490. Free PMC article.
- Cecil RM, Sugden LA. PLoS Comput Biol. 2023 Nov 27;19(11):e1010979. doi: 10.1371/journal.pcbi.1010979. eCollection 2023 Nov. PMID: 38011281. Free PMC article.
- Zhang H, Zhou H, Wei Y, Huang C. Front Neurorobot. 2022 Oct 25;16:996412. doi: 10.3389/fnbot.2022.996412. eCollection 2022. PMID: 36386393. Free PMC article.