A physics approach to classical and quantum machine learning
Alexey Melnikov
Institute for Theoretical Physics, University of Innsbruck
Institute for Quantum Optics and Quantum Information
Supervisor: Hans J. Briegel
Co-supervisors: Justus Piater and Gerhard Kirchmair
Jointly with Adi Makmal, Vedran Dunjko and Nicolai Friis
MIP Seminar, April 15, 2015
Interplay between quantum information theory and concepts from AI
Figure: the PS (projective simulation) model sits at the interface between quantum physics (quantum computing, quantum error correction, quantum walks) and artificial intelligence (intelligent agents, machine learning).
Outline
◦ Introduction
– artificial intelligence (AI) and its applications
– projective simulation (PS) model, a physical approach to AI
◦ Standard (classical) PS agent
– benchmarking (grid-world and mountain-car problems)
– generalization within the PS model
◦ Quantum PS agent
– implementation of a quantum agent
– superconducting transmon qubits
Artificial intelligence (AI) and intelligent agents
AI is the study of agents that receive percepts from the environment and perform actions.* Any AI program is called an intelligent agent.
Figure: an intelligent agent receives percepts from the environment and performs actions on it.
* S. Russell and P. Norvig. Artificial intelligence: A Modern Approach, 3rd edition (Prentice Hall, 2009).
AI in robotics
A robotic agent might have microphones, cameras and touch sensors, and various motors for actuators.*
Figure: a robot perceives the environment through microphones, cameras and touch, and acts on it through motors and voice.
Applications: robotics, finance, games, Google, QEC, ...
AI in finance
A trading agent perceives market rates and news, and trades in the stock market.
Figure: a trading agent perceives rates and news from the stock market and performs trades.
Applications: robotics, finance, games, Google, QEC, ...
AI in games
A game agent plays with you.
Figure: a game agent perceives your moves and responds with its own moves.
Applications: robotics, finance, games, Google, QEC, ...
AI on the web
A search engine interacts with a user.
Figure: Google receives a query from the user and returns a web page.
Applications: robotics, finance, games, Google, QEC, ...
AI in quantum error correction (QEC)
AI can be useful for quantum physics. A QEC agent gets data from syndrome measurements and performs error correction.*
Figure: a QEC agent receives syndrome data from a quantum register and applies correcting unitaries.
Applications: robotics, finance, games, Google, QEC, ...
* J. Combes, et al., arXiv:1405.5656 (2014).
Projective simulation (PS) agent
• The PS model is a novel physical approach to AI
• A PS agent processes information stochastically in a directed, weighted network of clips (units of memory)
• No computations, simple adjustment rules
• Natural candidate for quantization, using methods of quantum walks
Figure: a clip network; percepts enter at percept clips (input), the excitation hops between clips (e.g. from Clip 1 to Clip 3 with probability p13), and actions are triggered at action clips (output).
H. J. Briegel and G. De las Cuevas, Scientific Reports 2 (2012).
Projective simulation (PS) model
Each edge connects a clip ci with a clip cj and has a time-dependent weight h(t)(ci, cj). The h-values represent the unnormalized strengths of the edges, and determine the hopping probability from clip ci to clip cj according to

p(t)(cj | ci) = h(t)(ci, cj) / Σk h(t)(ci, ck).

The h-values are updated according to

h(t+1)(ci, cj) = h(t)(ci, cj) − γ (h(t)(ci, cj) − 1) + g(t)(ci, cj) λ,

where 0 ≤ γ ≤ 1 is a damping parameter and λ is a non-negative reward given by the environment. Each time an edge is visited, the corresponding g-value is set to 1, following which it is decreased after each time step with a rate η:

g(t+1)(ci, cj) = g(t)(ci, cj)(1 − η).

J. Mautner, A. Makmal, D. Manzano, M. Tiersch, and H. J. Briegel, New Generation Computing 33 (2015)
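The update rules above can be sketched as a minimal two-layer PS agent in Python. This is a sketch under stated assumptions: the class and method names are illustrative, not from Mautner et al., and the network is reduced to direct percept-to-action edges.

```python
import random

class PSAgent:
    """Minimal projective simulation agent (two-layer clip network).

    Illustrative sketch of the h- and g-value update rules; class and
    method names are this sketch's own, not from the PS papers.
    """

    def __init__(self, num_percepts, num_actions, gamma=0.0, eta=0.12, reward=1.0):
        self.gamma = gamma      # damping parameter, 0 <= gamma <= 1
        self.eta = eta          # glow decay rate
        self.reward = reward    # lambda, non-negative reward
        # h-values: unnormalized edge strengths, initialized to 1
        self.h = [[1.0] * num_actions for _ in range(num_percepts)]
        # g-values ("glow"): mark recently traversed edges
        self.g = [[0.0] * num_actions for _ in range(num_percepts)]

    def act(self, percept):
        """Hop from a percept clip to an action clip with p = h / sum_k h."""
        row = self.h[percept]
        r = random.uniform(0, sum(row))
        acc = 0.0
        for action, h_val in enumerate(row):
            acc += h_val
            if r <= acc:
                break
        self.g[percept][action] = 1.0   # visited edge: glow set to 1
        return action

    def learn(self, rewarded):
        """Apply the h- and g-value updates for one time step."""
        lam = self.reward if rewarded else 0.0
        for i, row in enumerate(self.h):
            for j in range(len(row)):
                # h(t+1) = h(t) - gamma*(h(t) - 1) + g(t)*lambda
                row[j] += -self.gamma * (row[j] - 1.0) + self.g[i][j] * lam
                # g(t+1) = g(t)*(1 - eta)
                self.g[i][j] *= (1.0 - self.eta)
```

With γ = 0 and λ = 1, a rewarded edge with full glow gains exactly +1 in h, matching the update rule on the slide.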
Grid-world task
The grid-world task: the goal of the game is to find the "star".
• The agent always starts from the (1,3) cell
• It can choose among four actions: left, right, up or down
• If the agent decides to go to a square labeled as "wall" or to go beyond the grid, no movement is performed but the time step is counted
• A reward of λ = 1 is received only after reaching the goal
• The performance of an agent in this task is evaluated by the number of steps it makes before reaching the goal in each trial

R. S. Sutton, Proc. of the 7th International Conference on Machine Learning (1990)
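The rules above can be encoded in a few lines of Python. Note this is only an illustration: the wall layout and goal ("star") position are placeholders of this sketch, not Sutton's original maze.

```python
# Minimal grid-world environment sketch following the rules above; walls,
# start and goal coordinates are illustrative placeholders.
class GridWorld:
    def __init__(self, width=6, height=9, walls=(), start=(1, 3), goal=(5, 8)):
        self.width, self.height = width, height
        self.walls = set(walls)
        self.start, self.goal = start, goal
        self.pos = start

    def reset(self):
        self.pos = self.start
        return self.pos

    def step(self, action):
        """Actions 0..3 = left, right, up, down; blocked moves still cost a step."""
        dx, dy = [(-1, 0), (1, 0), (0, 1), (0, -1)][action]
        x, y = self.pos[0] + dx, self.pos[1] + dy
        # moving into a wall or off the grid leaves the position unchanged
        if 0 <= x < self.width and 0 <= y < self.height and (x, y) not in self.walls:
            self.pos = (x, y)
        reached = self.pos == self.goal
        # reward lambda = 1 only on reaching the goal
        return self.pos, (1.0 if reached else 0.0), reached
```

An agent's trial then loops `step` until the goal is reached, counting the steps taken.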
PS network construction
Figure: percept clips, one for each cell (x = 1, y = 1), (x = 1, y = 2), ..., (x = 6, y = 9), connected to the four action clips (⇐, ⇒, ⇑, ⇓) by edges carrying h- and g-values hij, gij.
PS in the grid-world task. Learning curves
Figure: learning curves (average number of steps vs. trials, over 200 trials) of the PS agent in the grid-world task, with different η values (η = 0.03, 0.12, 0.15). A trade-off is observed between the best performance and the number of trials required to reach it.

Model | # of steps to goal after 100 trials | Parameters
PS†   | 15.4                                | λ = 1, η = 0.12, γ = 0
PI*   | 14                                  | β = 0.1, γ = 0.9, α = 1000

Performance of the PS model in comparison with the PI model.
† A. A. Melnikov, A. Makmal, and H. J. Briegel, Artificial Intelligence Research 3 (2014)
* R. S. Sutton, Proc. of the 7th International Conference on Machine Learning (1990)
Mountain-car problem
The goal is to find the "star" at x = 0.5.
• The agent always starts with a random position and velocity: x ∈ [−1.2, 0.5], v ∈ [−0.7, 0.7]
• It can choose among 3 actions: forward thrust (to the right), no thrust, and reverse thrust (to the left)
• The next state is defined by the equations

vnew = vold + 0.001 · Action − 0.0025 cos(3 xold)
xnew = xold + vold

• A reward of λ = 1 is received only after reaching the goal
• The performance of an agent in this task is evaluated by the number of steps it makes before reaching the goal in each trial

S. P. Singh and R. S. Sutton, Machine Learning 22, 123 (1996).
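The transition equations above translate directly into code. In this sketch the action encoding (−1, 0, +1) and the absence of explicit bounds handling are assumptions, not part of the slide.

```python
import math

# One step of the mountain-car dynamics as stated on the slide; the position
# is updated with the old velocity, per the slide's equations.
def mountain_car_step(x, v, action):
    """Return (x_new, v_new); action is -1 (reverse), 0 (no thrust), +1 (forward)."""
    v_new = v + 0.001 * action - 0.0025 * math.cos(3 * x)
    x_new = x + v
    return x_new, v_new
```

A trial then iterates this map from a random start until x reaches 0.5.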
PS network construction
Figure: percept clips, one for each discretized position-velocity cell [x0, x1] × [v0, v1], (x1, x2] × [v0, v1], ..., (x19, x20] × (v19, v20], connected by edges with h- and g-values hij, gij to the three action clips (−, =, +).
PS in the mountain-car task. Learning curves
Figure: (a) PS learning curves (average number of steps vs. trials) for the optimal value η = 0.02, with hopping probabilities p(t)(cj | ci) given by Eq. 1 and by the softmax rule (Eq. 2), over 20 trials. (b) The dependence of the PS performance on the η parameter after 20 trials, for both rules.

Model  | # of steps to goal after 100 trials | Parameters
PS†    | 223/trial                           | λ = 1, η = 0.02, γ = 0
SARSA* | 450/trial                           | 5 grids, each of 9 by 9 input space

Performance of the PS model in comparison with the SARSA algorithm.
† A. A. Melnikov, A. Makmal, and H. J. Briegel, Artificial Intelligence Research 3 (2014)
* S. P. Singh and R. S. Sutton, Machine Learning 22, 123 (1996)
Generalization. Motivation
There are many tasks in which percepts are composed of several elements. Even if two percept clips are different, they may contain a common set of elements. This common set of elements should be taken into account in order to share experience between different inputs.

Example: while driving, the agent sees a traffic light with an arrow sign and should choose among two actions: continue driving (+) or stop the car (−).

Useful generalization*:
• An ability for categorization (recognizing that all red signals have a common property, which we can refer to as redness)
• An ability to classify
• Relevant generalizations should be learned
• Correct actions should be associated with relevant generalized properties
• The generalization mechanism should be flexible

* A. A. Melnikov, A. Makmal, and H. J. Briegel, arXiv:1504.02247 (2015).
Mechanism of generalization
Figure: evolution of the clip network in four phases, panels (a)-(d); percept clips (colored arrow signals) connect through an emerging wildcard clip (#) to the action clips (+, −):
(a) 1 ≤ t ≤ 1000: the agent is rewarded for stopping at a red light and for driving at a green light
(b) 1000 < t ≤ 2000: the agent is rewarded for doing the opposite
(c) 2000 < t ≤ 3000: the agent should only follow the arrows
(d) 3000 < t ≤ 4000: the environment rewards the agent whenever it chooses to drive
Mechanism of generalization. Performance
Figure: the efficiency Et of the PS agent with generalization over 4000 time steps, across the four phases:
(a) 1 ≤ t ≤ 1000: the agent is rewarded for stopping at a red light and for driving at a green light
(b) 1000 < t ≤ 2000: the agent is rewarded for doing the opposite
(c) 2000 < t ≤ 3000: the agent should only follow the arrows
(d) 3000 < t ≤ 4000: the environment rewards the agent whenever it chooses to drive

A. A. Melnikov, A. Makmal, and H. J. Briegel, arXiv:1504.02247 (2015).
Quantum PS agent
• The PS model is a novel physical approach to AI
• A PS agent processes information stochastically in a directed, weighted network of clips (units of memory)
• No computations, simple adjustment rules
• Natural candidate for quantization, using methods of quantum walks*
Figure: a quantum clip network; the agent receives classical percepts as classical input and emits classical actions as classical output, while the walk over the clip network itself is quantum.

* G. D. Paparo, V. Dunjko, A. Makmal, M. A. Martin-Delgado, and H. J. Briegel, Phys. Rev. X 4, 031002 (2014).
Quantum PS agent
A classical random walk on a network with N clips is characterized by a transition matrix P, where each clip is a vector ci = [0, ..., 0, 1, 0, ..., 0]^T with unity in the i-th position:

P ci = Σj=1..N pij cj.

In the quantum case each clip is a state |ci⟩. However, a single unitary cannot encode the matrix P. We use a set of N unitaries for the quantum walk:

Ui |0⟩ = Σj=1..N √pij |cj⟩.

Figure: two-qubit probability unitaries for a PS network with 4 clips.

V. Dunjko, N. Friis, and H. J. Briegel, New J. Phys. 17 (2015)
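One simple way to realize such a probability unitary numerically (a generic construction, not necessarily the one used by Dunjko et al.) is to complete the target column of amplitudes √pij to an orthonormal basis, e.g. via a QR decomposition:

```python
import numpy as np

# Build a unitary U_i with U_i|0> = sum_j sqrt(p_ij)|c_j> by completing the
# target amplitude column to an orthonormal basis with a QR decomposition.
def probability_unitary(p_row):
    p = np.asarray(p_row, dtype=float)
    assert np.all(p >= 0) and abs(p.sum() - 1.0) < 1e-9
    target = np.sqrt(p)                       # desired first column U|0>
    n = len(target)
    # Stack the target with standard basis vectors, then orthonormalize
    basis = np.column_stack([target, np.eye(n)[:, 1:]])
    q, _ = np.linalg.qr(basis)
    # QR may flip signs; fix the first column to match the target exactly
    if np.dot(q[:, 0], target) < 0:
        q[:, 0] *= -1
    return q
```

The remaining columns of the unitary are arbitrary up to orthonormality; only the action on |0⟩ is fixed by the hopping probabilities.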
Nested coherent controlization
Figure: three-qubit probability unitaries for a PS network with 8 clips.
No-go theorem: additional degrees of freedom are needed.
Transmon qubits
An aluminum transmon qubit with a dipole antenna is mounted at the center of the cavity.
For one qubit, the system is described by the Hamiltonian

H/ℏ = ωr a†a + ωq b†b − (χqr/2) a†a b†b − (χrr/2) (a†a)² − (χqq/2) (b†b)²,

where a and b are the dressed mode operators of the resonator and the qubit, respectively, ωr and ωq are their frequencies, χqr is the coupling between them, and χrr, χqq are the anharmonicities.

H. Paik, et al., Phys. Rev. Lett. 107 (2011).
B. Vlastakis, et al., Science 342 (2013).
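This Hamiltonian can be written down numerically in a truncated Hilbert space. In the sketch below, the frequencies and Kerr coefficients are illustrative placeholders, not values from the experiment, and the truncation dimensions are arbitrary.

```python
import numpy as np

def lowering(dim):
    """Truncated bosonic lowering operator a with <n-1|a|n> = sqrt(n)."""
    return np.diag(np.sqrt(np.arange(1, dim)), k=1)

# Dispersive transmon-cavity Hamiltonian from the slide (units of hbar);
# all numerical parameters here are placeholders for illustration.
def transmon_cavity_hamiltonian(dim_r=5, dim_q=3, wr=8.0, wq=6.0,
                                chi_qr=0.01, chi_rr=1e-5, chi_qq=0.3):
    a = np.kron(lowering(dim_r), np.eye(dim_q))   # resonator mode
    b = np.kron(np.eye(dim_r), lowering(dim_q))   # qubit mode
    n_r = a.conj().T @ a
    n_q = b.conj().T @ b
    # H/hbar = wr a†a + wq b†b - (chi_qr/2) a†a b†b
    #          - (chi_rr/2) (a†a)^2 - (chi_qq/2) (b†b)^2
    return (wr * n_r + wq * n_q
            - 0.5 * chi_qr * n_r @ n_q
            - 0.5 * chi_rr * n_r @ n_r
            - 0.5 * chi_qq * n_q @ n_q)
```

Because every term is built from number operators, the Hamiltonian is diagonal in the joint number basis, which makes the dispersive shifts easy to read off.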
Coherent controlization using transmon qubits
We use the cavity as an additional degree of freedom to implement coherent controlization.
The resonance frequency of the cavity depends on the state of the qubits. For two superconducting qubits, we may hence label these frequencies ω00, ω01, ω10, ω11, corresponding to the two-qubit states |00⟩, |01⟩, |10⟩, and |11⟩, respectively.
Coherent controlization using transmon qubits
Conclusion
◦ Standard (classical) PS agent
– is a competitive AI model (grid-world and mountain-car problems)
– the generalization mechanism improves the model
– has potentially many applications
◦ Quantum PS agent
– quantization, using known methods of quantum walks
– implementation using superconducting qubits
Thank you for your attention!