October 18, 2017
The latest AI can work things out without being taught

IN 2016 Lee Sedol, one of the world’s best players of Go, lost a match in Seoul to a computer program called AlphaGo by four games to one. It was a big event, both in the history of Go and in the history of artificial intelligence (AI). Go occupies roughly the same place in the culture of China, Korea and Japan as chess does in the West. After its victory over Mr Lee, AlphaGo beat dozens of renowned human players in a series of anonymous games played online, before re-emerging in May to face Ke Jie, the game’s best player, in Wuzhen, China. Mr Ke fared no better than Mr Lee, losing to the computer 3-0.
For AI researchers, Go is equally exalted. Chess fell to the machines in 1997, when Garry Kasparov lost a match to Deep Blue, an IBM computer. But until Mr Lee’s defeat, Go’s complexity had made it resistant to the march of machinery. AlphaGo’s victory was an eye-catching demonstration of the power of a type of AI called machine learning, which aims to get computers to teach complicated tasks to themselves.
AlphaGo learned to play Go by studying thousands of games between expert human opponents, extracting rules and strategies from those games and then refining them in millions more matches which the program played against itself. That was enough to make it stronger than any human player. But researchers at DeepMind, the firm that built AlphaGo, were confident that they could improve it. In a paper just published in Nature they have unveiled the latest version, dubbed AlphaGo Zero. It is much better at the game, learns to play much more quickly and requires far less computing hardware to do well. Most important, though, unlike the original version, AlphaGo Zero has managed to teach itself the game without recourse to human experts at all.

The eyes have it
Like all the best games, Go is easy to learn but hard to master. Two players, Black and White, take turns placing stones on the intersections of a board consisting of 19 vertical lines and 19 horizontal ones. The aim is to control more territory than your opponent. Stones that are surrounded by an opponent’s are removed from the board. Players carry on until neither wishes to continue. Each then adds the number of his stones on the board to the number of empty grid intersections he has surrounded. The larger total is the winner.
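The counting rule just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the position is taken to be final, dead stones are assumed to have been removed already, and the board encoding ('B', 'W' and '.' for an empty point) and the function name `area_score` are invented here rather than taken from any Go program.

```python
from collections import deque

def area_score(board):
    """Return (black_total, white_total) for a finished position.

    board is a list of equal-length strings of 'B', 'W' and '.' (empty).
    """
    rows, cols = len(board), len(board[0])
    score = {"B": 0, "W": 0}
    seen = set()
    for r in range(rows):
        for c in range(cols):
            cell = board[r][c]
            if cell in score:
                score[cell] += 1              # each stone on the board counts one
            elif (r, c) not in seen:
                # Flood-fill this empty region, noting which colours border it.
                region, borders = 0, set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:
                    y, x = queue.popleft()
                    region += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < rows and 0 <= nx < cols:
                            neighbour = board[ny][nx]
                            if neighbour == ".":
                                if (ny, nx) not in seen:
                                    seen.add((ny, nx))
                                    queue.append((ny, nx))
                            else:
                                borders.add(neighbour)
                # Empty points count only when a single colour surrounds them.
                if len(borders) == 1:
                    score[borders.pop()] += region
    return score["B"], score["W"]

example = [
    ".B.W.",
    ".B.W.",
    ".B.W.",
    ".B.W.",
    ".B.W.",
]
print(area_score(example))   # (10, 10)
```

In the example each side has five stones and five surrounded points; the middle column touches both colours, so it counts for neither.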
The difficulty comes from the sheer number of possible moves. A 19x19 board offers 361 different places on which Black can put the initial stone. White then has 360 options in response, and so on. The total number of legal board arrangements is of the order of 10^170, a number so large it defies any physical analogy (there are reckoned to be about 10^80 atoms in the observable universe, for instance).
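A rough check on those magnitudes is easy to run. The sketch below does not count legal positions, which is a much harder problem. It relies on two simpler quantities chosen for illustration: 3^361, an upper bound that treats every point as empty, black or white and so includes illegal arrangements, and 361 factorial, the number of orders in which the points could be filled if no stone were ever captured.

```python
import math

POINTS = 19 * 19                      # 361 intersections

# Every point is empty, black or white, so 3**361 bounds the number of
# arrangements from above (the bound also counts illegal positions).
upper_bound = 3 ** POINTS

# 361 choices, then 360, and so on: a crude stand-in for the number of
# move sequences, assuming no captures and no passes.
fill_orders = math.factorial(POINTS)

def order_of_magnitude(n):
    """floor(log10(n)) for a positive integer: its digit count minus one."""
    return len(str(n)) - 1

print(order_of_magnitude(upper_bound))    # 172
print(order_of_magnitude(fill_orders))
```

The upper bound sits two orders of magnitude above the figure for legal arrangements, and both dwarf the estimate for atoms in the observable universe.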
Human experts focus instead on understanding the game at a higher level. Go’s simple rules give rise to plenty of emergent structure. Players talk of features such as “eyes” and “ladders”, and of concepts such as “threat” and “life-and-death”. But although human players understand such concepts, explaining them in the hyper-literal way needed to program a computer is much harder. Instead, the original AlphaGo studied thousands of examples of human games, a process called supervised learning. Since human play reflects human understanding of such concepts, a computer exposed to enough of it can come to understand those concepts as well. Once AlphaGo had arrived at a decent grasp of tactics and strategy with the help of its human teachers, it kicked away its crutches and began playing millions of unsupervised training games against itself, improving its play with every game.
Supervised learning is useful for much more than Go. It is the basic idea behind many of the recent advances in AI, helping computers learn to do things such as identify faces in pictures, recognise human speech reliably, filter spam from e-mail efficiently and more. But as Demis Hassabis, DeepMind’s boss, observes, supervised learning has limits. It relies on the availability of training data to feed to the computer to show the machine what it is meant to be doing. Such data must be filtered by human experts. The training data for face recognition, for instance, consist of thousands of pictures, some with faces and some without, each labelled as such by a person. That makes such data sets expensive, assuming they are available at all. And, as the paper points out, there can be more subtle problems. Relying on human experts for guidance risks imposing human limits on a computer’s ability.
AlphaGo Zero is designed to avoid all these problems by skipping the training-wheels phase entirely. The program starts only with the rules of the game and a “reward function”, which awards it a point for a win and docks a point for a loss. It is then encouraged to experiment, repeatedly playing games against other versions of itself, subject only to the constraint that it must try to maximise its reward by winning as much as possible.
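The idea of learning from nothing but the rules and a win-or-lose reward can be shown on a game far simpler than Go. The sketch below is not DeepMind's algorithm. It is a toy, assumed purely for illustration: two copies of the same program play a stone-taking game (remove one, two or three stones from a pile; whoever takes the last stone wins) and keep a running average of the reward each move has earned. The game, the function names and the parameters are all hypothetical.

```python
import random

def reward(winner, player):
    """One point for a win, minus one for a loss."""
    return 1 if winner == player else -1

def legal_moves(stones):
    return [n for n in (1, 2, 3) if n <= stones]

def best_move(q, stones):
    """The move with the highest average reward seen so far."""
    return max(legal_moves(stones), key=lambda m: q.get((stones, m), 0.0))

def self_play(games=20000, start=10, explore=0.2, seed=0):
    rng = random.Random(seed)
    q, visits = {}, {}      # average reward and visit count per (stones, move)
    for _ in range(games):
        stones, player, history = start, 0, []
        while stones > 0:
            if rng.random() < explore:
                move = rng.choice(legal_moves(stones))   # experiment
            else:
                move = best_move(q, stones)              # use what is known
            history.append((player, stones, move))
            stones -= move
            player = 1 - player
        winner = history[-1][0]      # whoever took the last stone
        # Credit every move of the game with its player's final reward.
        for mover, s, m in history:
            key = (s, m)
            visits[key] = visits.get(key, 0) + 1
            old = q.get(key, 0.0)
            q[key] = old + (reward(winner, mover) - old) / visits[key]
    return q

q = self_play()
print({s: best_move(q, s) for s in (1, 2, 3, 5, 6, 7)})
```

Nobody tells the program that the winning idea is to leave the opponent a multiple of four stones. In this toy setting the preference should emerge from the rewards alone, which is the point of the exercise; Go needs vastly more machinery to achieve the same effect.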
The program started by placing stones randomly, with no real idea of what it was doing. But it improved rapidly. After a single day it was playing at the level of an advanced professional. After two days it had surpassed the performance of the version that beat Mr Lee in 2016.
DeepMind’s researchers were able to watch their creation rediscover the Go knowledge that human beings have accumulated over thousands of years. Sometimes, it seemed eerily human-like. After about three hours of training the program was preoccupied with the idea of greedily capturing stones, a phase that most human beginners also go through. At others it seemed decidedly alien. For example, ladders are patterns of stones that extend in a diagonal slash across the board as one player attempts to capture a group of his opponent’s stones. They are frequent features of Go games. Because a ladder consists of a simple, repeating pattern, human novices quickly learn to extrapolate them and work out if building a particular ladder will succeed or fail. But AlphaGo Zero—which is not capable of extrapolation, and instead experiments with new moves semi-randomly—took longer than expected to come to grips with the concept.

Climbing the ladder
Nevertheless, learning for itself rather than relying on hints from people seemed, on balance, to be a big advantage. For example, joseki are specialised sequences of well-known moves that take place near the edges of the board. (Their scripted nature makes them a little like chess openings.) AlphaGo Zero discovered the standard joseki taught to human players. But it also discovered, and eventually preferred, several others that were entirely of its own invention. The machine, says David Silver, who led the AlphaGo project, seemed to play with a distinctly non-human style.
The result is a program that is not just superhuman, but crushingly so. Skill at Go (and chess, and many other games) can be quantified with something called an Elo rating, which gives the probability, based on past performance, that one player will beat another. A player has a 50:50 chance of beating an opponent with the same Elo rating, but only a 25% chance of beating one with a rating 200 points higher. Mr Ke has a rating of 3,661. Mr Lee’s is 3,526. After 40 days of training AlphaGo Zero had an Elo rating of more than 5,000—putting it as far ahead of Mr Ke as Mr Ke is of a keen amateur, and suggesting that it is, in practice, impossible for Mr Ke, or any other human being, ever to defeat it. When it played against the version of AlphaGo that first beat Mr Lee, it won by 100 games to zero.
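The arithmetic behind those probabilities can be reproduced with the standard Elo expected-score formula, in which a gap of 400 points corresponds to odds of ten to one. The sketch assumes that formula applies to the ratings quoted here; the function name is illustrative, and 5,000 is used as a floor for AlphaGo Zero's rating since the text says only that it exceeded that figure.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo expectation: the chance of winning, ignoring draws."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))

print(round(expected_score(3000, 3000), 2))   # 0.5
print(round(expected_score(3000, 3200), 2))   # 0.24

# Mr Ke (3,661) against a program rated at least 5,000.
print(expected_score(3661, 5000))
```

On the standard formula a 200-point deficit gives a chance of about 24%, close to the rounded figure above, and a deficit of more than 1,300 points leaves Mr Ke with less than one chance in a thousand per game.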
There is, of course, more to life than Go. Algorithms such as the ones that power the various iterations of AlphaGo might, its creators hope, be applied to other tasks that are conceptually similar. (DeepMind has already used those that underlie the original AlphaGo to help Google slash the power consumption of its data centres.) But an algorithm that can learn without guidance from people means that machines can be let loose on problems that people do not understand how to solve. Anything that boils down to an intelligent search through an enormous number of possibilities, said Mr Hassabis, could benefit from AlphaGo’s approach. He cited classic thorny problems such as working out how proteins fold into their final, functional shapes, predicting which molecules might have promise as medicines, or accurately simulating chemical reactions.
Advances in AI often trigger worries about human obsolescence. DeepMind hopes such machines will end up as assistants to biological brains, rather than replacements for them, in the way that other technologies from search engines to paper have done. Watching a machine invent new ways to tackle a problem can, after all, help push people down new and productive paths. One of the benefits of AlphaGo, says Mr Silver, is that, in a game full of history and tradition, it has encouraged human players to question the old wisdom, and to experiment. After losing to AlphaGo, Mr Ke studied the computer’s moves, looking for ideas. He then went on a 22-game winning streak against human opponents, an impressive feat even for someone of his skill. Supervised learning, after all, can work in both directions.