Week Ending 09.28.2018
RESEARCH WATCH: 10.07.2018
Over the week ending September 28, 37 new papers were published in "Computer Science".
The paper discussed most in the news over the week ending September 28 was "Jointly Discovering Visual Objects and Spoken Words from Raw Sensory Input" by David Harwath et al (Apr 2018), which was referenced 26 times, including in the article MIT ups the ante in getting one AI to teach another in ZDNet. The paper author, David Harwath (Researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) and the Spoken Language Systems Group), was quoted saying "We wanted to do speech recognition in a way that’s more natural, leveraging additional signals and information that humans have the benefit of using, but that machine learning algorithms don’t typically have access to. We got the idea of training a model in a manner similar to walking a child through the world and narrating what you’re seeing". The paper got social media traction with 48 shares. The researchers explore neural network models that learn to associate segments of spoken audio captions with the semantically relevant portions of natural images that they refer to.
The paper shared the most on social media this week is "Covfefe: A Computer Vision Approach For Estimating Force Exertion" by Vaneet Aggarwal et al (Sep 2018) with 57 shares. The investigators use face videos and the photoplethysmography (PPG) signals to classify force exertion levels of 0\ %, 50\ %, and 100\ % (representing rest, moderate effort, and high effort), thus providing a non - intrusive and scalable approach.
Over the week ending September 28, 50 new papers were published in "Computer Science - Artificial Intelligence".
The paper discussed most in the news over the week ending September 28 was by a team at IBM: "Automated Test Generation to Detect Individual Discrimination in AI Models" by Aniya Agarwal et al (Sep 2018), which was referenced 3 times, including in the article Trust and Transparency for AI on the IBM Cloud in WebWire. The paper got social media traction with 5 shares. The authors address the problem of detecting whether a model has an individual discrimination.
Over the week ending September 28, 19 new papers were published in "Computer Science - Computers and Society".
The paper discussed most in the news over the week ending September 28 was "A Quantitative Approach to Understanding Online Antisemitism" by Joel Finkelstein et al (Sep 2018), which was referenced 133 times, including in the article Online communities see large growth in anti-Semitic comments, memes in UAB Kaleidoscope. The paper author, Jeremy Blackburn (University of Alabama at Birmingham), was quoted saying "There may be 100 racists in your town, but in the past they would have to find each other in the real world. Now they just go online". The paper got social media traction with 43 shares. The researchers present a large-scale, quantitative study of online antisemitism.
The paper shared the most on social media this week is by a team at University of Washington: "Over-Optimization of Academic Publishing Metrics: Observing Goodharts Law in Action" by Michael Fire et al (Sep 2018) with 55 shares. The researchers analyzed over 120 million papers to examine how the academic publishing world has evolved over the last century.
Over the week ending September 28, 37 new papers were published in "Computer Science - Learning".
The paper discussed most in the news over the week ending September 28 was by a team at Georgia Institute of Technology: "Co-Creative Level Design via Machine Learning" by Matthew Guzdial et al (Sep 2018), which was referenced 1 time, including in the article Forget dumping games designers for AI - turns out it takes two to tango in The Register. The paper got social media traction with 21 shares.
This week was very active for "Computer Science - Robotics", with 79 new papers.
The paper discussed most in the news over the week ending September 28 was by a team at University of Lincoln: "Agricultural Robotics: The Future of Robotic Agriculture" by Tom Duckett et al (Jun 2018), which was referenced 12 times, including in the article Scottish Farmers Test Machine Vision to Manage Pig Pugnacity in Spectrum Online. The paper got social media traction with 20 shares.
The paper shared the most on social media this week is "Real-Time Monocular Object-Model Aware Sparse SLAM" by Mehdi Hosseinzadeh et al (Sep 2018) with 87 shares.
Over the week ending September 28, 15 new papers were published in "Computer Science - Neural and Evolutionary Computing".
The paper discussed most in the news over the week ending September 28 was by a team at Michigan State University: "The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities" by Joel Lehman et al (Mar 2018), which was referenced 19 times, including in the article The Spooky Genius of Artificial Intelligence in Atlantic.com - The Wire. The paper also got the most social media traction with 4042 shares.
Goodhart’s law affects computers too.
A common method in machine learning is to specify a “cost function” that measures how good an action is and use this to train an agent towards desired behavior.
For example, you might specify a cost function for a “simple” task like walking with the following:
The above criteria are not enough since a robot rolling on its side while moving its legs meets all three but is certainly not “walking”.
In fact, mathematically specifying vague human ideas in general is actually extremely hard. Cost functions often have bugs or incentivize subtly wrong actions.
This paper collects stories from multiple labs across academia and industry on evolutionary algorithms acting in unexpected ways.
Note that computers (for now) always do exactly what they are told. Evolutionary algorithms produce interesting results simply due to mis-specified instructions, rather than any sort of “defiant” behavior in the program.
In 1975, famed economist Charles Goodhart coined Goodhart’s Law which states that “when a measure becomes a target, it ceases to be a good measure.”
This is similar to the idea of unintended consequences that result from people following a law to the letter while completely violating its spirit.
One anecdote from British India.
Delhi suffered from venomous cobras, so the government put out a bounty for dead snake skins to try to eliminate the problem.
Citizens realized they could breed the snakes for massive profit instead of actually hunting the cobras.
The government ended the program after they realized the ruse, but this then induced the breeders to release their (now useless) snakes into the wild. Thus, the policy ended up worsening the cobra problem.
Comparisons to humans
Usually, human norms and culture will prevent such a disaster from occurring, with the exception of legal issues (like the cobra example above) where the letter of the law (not its spirit) matters most.
If a CEO casually requests her sales team to increase revenue by a million dollars next quarter, this does not include breaking into a bank and stealing the money, even though this is never explicitly forbidden.
Coding human norms into computer algorithms is quite difficult and rarely done. Thus, the computer will follow “the letter of the law” to the extreme, sometimes with disastrous consequences.
Is this creativity as the paper claims? Judge for yourself.
We certainly call lawyers creative at times.
Evolutionary Algorithms vs Gradient Descent
The paper’s title suggests that evolutionary algorithms are the root cause of the behavior when the actual reason is a mis-specified cost function.
Evolutionary algorithms do not require the cost functions to be differentiable and thus can be more generally applied than stochastic gradient descent.
However, optimizing the cost function with techniques like stochastic gradient descent (the most common optimizer for deep neural networks) will result in similar issues.
OpenAI trained an agent to play the racing game CoastRunners using reinforcement learning.
Instead of progressing through the racetrack, the agent hacked a higher score by looping in a short circle and collecting optional powerups.
This was a classic example of following the law to the letter (maximizing score) while totally violating the spirit (completing the racetrack).
Researchers used an evolutionary algorithm to sort numbers, but the evaluation code merely checked that the algorithm returned any sorted list instead of ensuring that it sorted the numbers that were given.
The agent returned an empty list for every input and got full marks since an empty list is in sorted order.
Imperfect simulators mean they don’t always match the real world.
Computer scientists and physicists once collaborated to find better carbon nanostructures.
An evolutionary algorithm suggested a molecule where all the carbon atoms were stacked in the exact same position in space, something not explicitly ruled out in the simulation but impossible in the real world.
Physicists blamed the computer scientists for producing an impossible configuration, while the computer scientists blamed the physicists for a faulty simulation model.
The cross-discipline collaboration collapsed shortly afterward.
Though these stories seem frivolous in hindsight, it is actually extremely hard to specify a good reward function that isn’t vulnerable to Goodhart’s law.
Applications to AI Safety
Stories above about “toy” simulations gone wrong suddenly become causes for extreme concern if agents deviate significantly from expected behavior in the real world.
In one airplane landing simulation, an evolutionary algorithm applied extremely large forces to the airplane and triggered a bug, causing the simulation to overflow and claim impossibly good results for dangerous behavior.
Containment thus seems pretty important. Never release a black box optimizer directly into the wild without having high confidence in its expected behavior.
- Hugh Zhang, Stanford