Deep Reinforcement Learning: Curiosity driven Super Mario

I have used Deep Reinforcement Learning with Curiosity-driven Exploration to train an agent playing Super Mario in the OpenAI Gym for the Nintendo NES emulator. The untrained Mario is obviously the one on the left side. The input data for the agent are the raw pixels. The environmental reward (i.e. the value the agent tries to maximize) is the game score.
I ran the training in a Docker container based on the latest pytorch/pytorch image with some adaptations for the graphics output. My starting point was the example source code from the MEAP book “Deep Reinforcement Learning in Action” by Alexander Zai and Brandon Brown, which I highly recommend. To get the code, follow the “Source Code” link for the book. The training took less than an hour on a 4-core i5 at 2.9 GHz with 16 GB of memory and no GPU involved. It is a little scary to realize how far one can get with relatively modest computational resources in such a short training time.
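
The post does not show how the curiosity signal is computed, so the following is only a minimal sketch of the general idea, not the book's code: a forward model predicts the features of the next state, and the prediction error is paid out as an extra (intrinsic) reward on top of the game score. The function names, the `scale` and `weight` parameters, and the use of plain lists as feature vectors are assumptions made for illustration.

```python
from typing import Sequence


def intrinsic_reward(predicted_next: Sequence[float],
                     actual_next: Sequence[float],
                     scale: float = 0.5) -> float:
    """Curiosity bonus: scaled squared error of the forward model's prediction.

    A large error means the agent reached a state it could not predict,
    which is rewarded to encourage exploration.
    """
    if len(predicted_next) != len(actual_next):
        raise ValueError("feature vectors must have the same length")
    return scale * sum((p - a) ** 2 for p, a in zip(predicted_next, actual_next))


def total_reward(extrinsic: float,
                 predicted_next: Sequence[float],
                 actual_next: Sequence[float],
                 weight: float = 1.0) -> float:
    """Combine the environment reward (game score) with the curiosity bonus."""
    return extrinsic + weight * intrinsic_reward(predicted_next, actual_next)
```

With a perfect prediction the bonus vanishes and only the game score remains; the less predictable a transition is, the more it pays.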

Book Recommendation: Deep Learning and the Game of Go

I have managed to sneak my name onto the back cover of one of the coolest books to come out in 2019.


Max Pumperla and Kevin Ferguson, Deep Learning and the Game of Go, Manning, published today,  ISBN 9781617295324

I warmly recommend this book to anyone with an interest in deep reinforcement learning. The game of Go is a fantastic use case that keeps you inspired and on course to the last page.

With a well-thought-out structure and some good humor, Max Pumperla and Kevin Ferguson have masterfully accomplished the task of writing a highly readable, comprehensive introduction that is at no point boring and leaves out nothing important.

You can find two free chapters at

What the German government’s national strategy on Artificial Intelligence should look like

Better late than never: In late July, the German federal government published a cornerstone paper on a “strategy artificial intelligence”. The details of the strategy will be worked out by November and presented at the Digital Summit in December.

There will be a lot to do after the summer break. The paper, in its current state, is mostly generic, and very little commitment shines through. When you read it to the end, you’ll find a silver lining in the very last section “Immediate measures of the Federal Government” (my translation):

Keeping and retaining AI experts in Germany has immediate priority across programs and policies. Networking and expansion of competence centers with France will be implemented without delay.

If even these two bullet points translate into tangible action, the paper will be worth the paper. Yet it does not define a strategy. The paper as a whole looks more like a concatenation of brainstorming items with no clear direction.

Some of the cornerstones seem to be unrelated to the topic of AI. For example the idea to invest in infrastructure. Sounds good, but what does it mean in the context of AI? Building a national GPU cloud?

So let’s see if we can do better. Let’s define a better AI strategy for Germany.

A better AI strategy for Germany

Global context

First, let’s look at the global context, because it makes no sense to position ourselves when we don’t know where everybody else is standing.

China


When AlphaGo beat Lee Sedol in 2016, we felt with the Korean people, who had to witness a national idol losing against DeepMind’s machine. What I didn’t realize then was that for China, this was nothing short of a Sputnik shock. An ancient Chinese game that requires intellectual brilliance, strategic planning, experience and intuition to win was suddenly won by a British (of all nations) team of a few young people and their computer. It was apparent that a technology with these powers can be used for far more than playing games. And this came at a time when the West was more reluctant than ever to share advanced technology with China.

So Beijing committed itself to becoming the global AI leader by 2030. China has earmarked hundreds of billions of dollars for collaboration with its existing tech leaders and to encourage the rise of unicorn startups. Last year, the State Council’s release of a national strategy for AI development channeled and focused existing initiatives and made them a national priority.

Have a look at the FHI’s paper Deciphering China’s AI Dream for more.

United States


Today, the United States is by all means leading in the field of artificial intelligence research, and it has also introduced the most policy reports on AI strategies. It is difficult to say whether the current federal administration is willing to implement any of these strategies, and if so, how long the political climate will allow researchers and engineers to move forward at a meaningful speed.

From the outside it looks like the administration prefers leaving the field mostly to the private sector. When it comes to the industrial application of AI, this corresponds well with the overall political direction. AI innovations can play a key role in concert with other policies. For example: After pulling back from globalization and pushing a part of the workforce out of the country, AI will have to play a major role in replacing manual labor.

But with a few exceptions, automation is not the area of artificial intelligence that American innovators are best at. The frontiers of the field are pushed by companies that will keep targeting the global market with disruptive services, mostly for end users. The fact that these companies have their headquarters in the United States does not automatically create a competitive advantage for the nation, beyond increased tax revenues.

At the end of the day, the ability or inability of Washington to implement a concise strategy might not even matter, because the United States has other federal bodies that operate with relative independence and have proven in the past that they can efficiently help out in these situations. DARPA, with its annual 3 billion dollar budget and great track record on strategic investments, is an outstanding example. Less visible but probably no less effective are the efforts of the intelligence agencies to utilize AI for their specific needs.

European Union

In 2013, the European Union proposed the 10-year Human Brain Project, which is still the most important human brain research project in the world.

A year earlier, in 2012, the European Commission decided to initiate a Public-Private Partnership in Robotics, later named SPARC. Driven by an aging population and little access to cheap labor, manufacturing companies in the EU are traditionally under more pressure to automate than companies in China and the US. Hence robotics is taken very seriously in the EU.

Japan


Japan has an even sharper focus on robotics, mostly for the same reasons as the EU. It has, by some margin, the most robot users, robotics equipment, and service manufacturers in the world.

France


In March, Emmanuel Macron outlined France’s national strategy for artificial intelligence. The government will spend 1.5 billion euros over five years to support AI research, encourage startups, and collect data.

In a Wired interview, Macron discussed the reasons behind the initiative. While you can read some fear of missing out between the lines, one central goal seems to be defending European values and the way we live. When a technology shapes every aspect of our lives as AI does, it’s best to be involved in shaping the standards that govern this technology.

Paris is already an AI hub with labs of some of the biggest players. With regained self-confidence after winning the soccer world cup, and under considerate, sober-minded political leadership, it looks like France will enter the club of global AI leaders sooner rather than later, with or without the rest of the EU.

United Kingdom

With the chaos surrounding Brexit, it is hard to say whether the UK will be able to execute consistently on a national AI strategy, or any strategy at all. But it certainly has enormous potential. DeepMind, Swiftkey and Babylon all started in the UK.

In April, the UK government presented something like a national AI strategy in a quite detailed policy paper called the “AI Sector Deal”. The point of this deal is to establish a strong partnership between business, academia and government. The objective is rather bold:

 A revolution in AI technology is already emerging. If we act now, we can lead it from the front. But if we ‘wait and see’ other countries will seize the advantage. Together, we can make the UK a global leader in this technology that will change all our lives.

The goals do not exude unnecessary modesty either:

  • AI and Data Economy – We will put the UK at the forefront of the artificial intelligence and data revolution
  • Future of Mobility – We will become a world leader in the way people, goods and services move
  • Clean Growth – We will maximise the advantages for UK industry from the global shift to clean growth
  • Ageing Society – We will harness the power of innovation to help meet the needs of an ageing society

India


In June, an Indian government think tank presented the country’s AI strategy in the form of a discussion paper. Among other things, the paper discusses the possibility of using AI for social inclusion and of positioning India as an AI hub for the developing world.



Sophia, Saudi Arabia’s first robot citizen, gave a speech at the pre-opening of the Munich Security Conference earlier this year.

Many other countries are implementing their AI strategies at full steam. The UAE has a Ministry for AI, and Saudi Arabia has at least one robotic citizen. A good and regularly updated overview of national AI strategies can be found on Tim Dutton’s blog.


Timing, Pace and Direction

Germany is a little late in this.

Here is the good news: Consolidation has not even begun. The development is so fast that it does not matter much if you are behind today. We are currently in the qualifying phase, where nations, blocs and organizations compete for pole position in the much more important race for the best utilization of AI. The prize is a short but abundant period of economic and political dominance that will be used by the winner to shape societies and the global political landscape for many years. This race will be won by the region that wants it most. And currently this seems to be China.

But not all contestants in this race run in the same direction, so it is unclear whether we even have a race and what the criteria for winning are. While Silicon Valley focuses on creating science fiction technology to create new markets and disrupt others, China will work on using AI for efficiency gains to stay competitive in its existing markets, and probably on intelligence and military applications to keep its trade routes open. Sub-Saharan Africa, on the other hand, will keep looking for innovative ways to provide public services without first building up the expensive 19th century infrastructure that serves as a basis for these services in other countries.

It makes sense that every region focuses on solving its particular problems first. While it seems inevitable that countries and blocs compete in building up AI capabilities, they don’t necessarily need to compete in the development of specific technologies or certain applications of AI. In the past, learning from each other has mostly been a good practice with new technologies. With AI comes a new twist: now even our systems can learn from each other. And they will, even if we don’t want it. There is currently no feasible IP protection for machine behavior, for the same reason that there is no way to stop monkeys from copying each other’s behavior. Nonetheless an implicit protection exists: When an AI system solves a problem that others don’t have, there is little incentive to copy it. There is also indirect protection: When my AI system finds the formula for an active pharmaceutical component faster than your AI system, I can protect this result with the established procedure.

Geopolitical situation

One consequence of America’s shifted priorities is that the western world as a whole is without direction, and so far it is doing a terrible job of finding a new leader. It has also become clear that China is neither willing nor able to assume responsibility for the world order as fast as the United States is giving it up. The western hemisphere, and to some extent also the rest of the world, needs a replacement for America’s leadership. Germany has always been particularly vulnerable to geopolitical chaos. While meandering in the growing maze of political fragmentation, Germany at least needs to coordinate new policies and strategies with its neighbors, to create some coherence and stability. Beyond that, in order to find a new order for the West, we need to develop the ability to give trusted neighbors the lead in important topics.

Better Cornerstones

With the groundwork in place, let’s now see what the cornerstones of a German AI strategy should be:

  1. European cooperation, especially with France
  2. Quantum AI
  3. Edge computing
  4. Empowering the individual
  5. Grand Challenges

European Cooperation, especially with France

Cooperation with France was already in the original paper, and there seems to be some progress. It is by far the most important point, for the reason that President Macron mentioned in the Wired interview: To defend European values we need to participate in setting the standards, and this cannot be done effectively with regulations (otherwise our public spirit would be in much better shape, for we have no shortage of regulations). It must be done by shaping the technology, establishing facts by creating useful systems and promoting them, so that they become de facto standards when others start using them.

This concerns our freedom and the way we live. It is absolutely essential that Europe is united in this, because no single European nation, not even France, is even remotely on par with China and the United States at this time. If France goes ahead in this quest, Germany should make it its top priority to provide every possible support.

Quantum AI

The second largest strategic mistake we can make is a not-so-obvious one: not providing students at technical universities with access to quantum computers. The match between quantum computing’s opportunities and AI’s problems is so good that as soon as “quantum supremacy” is reached, it will have an even greater impact on AI than on cryptography (and the impact on cryptography is expected to be drastic). The impact will also come sooner, because for AI applications we don’t have to wait for the quantum computing community to figure out error correction to the quasi-deterministic level that we have in classical computers today. For cryptanalysis this is essential. For neural network training it is not.

I did not read anything about quantum computing in the cornerstone paper at all. A national strategy should at least consider the opportunities that might open up for AI, especially in a country that has been left behind in supercomputing but is at the same time the home of Werner Heisenberg and Max Planck.

Edge Computing

Even without AI, Edge Computing (beyond Industry 4.0 scenarios) should be a national priority for rather profane reasons like poor internet connectivity and expensive data plans. Add AI and data-driven business models to the picture, and it becomes clear that Edge Computing solves a whole pile of problems that are specific to Germany. To pick out the most obvious one: Strict privacy regulations make it hard for businesses (and impossible for small businesses) to offer cloud-based, data-driven services. But when personal data stays on premise at all times (because the relevant data processing happens at the consumer’s site), a whole new world of innovative services becomes possible, without subjecting the people who offer these services to the prospect of draconian punishments.

Moving AI workloads from the cloud to the edge introduces changes that need special consideration.

  • Training deep neural networks can require a lot of computing power. Moving high performance computing capabilities to the edge would be a waste of resources, if they are only fully used for short peak loads. Research and product development that works towards a smooth utilization of edge resources should be supported. A priority should be use cases, where either a high but even load is put on edge nodes naturally, for example deep reinforcement learning with live data, or where neighboring nodes can sell idle resources to nodes that temporarily need stronger capabilities, for example based on IOTA.
  • Machine learning can be power hungry too. Moving workloads out of centralized data centers closer to the data works well together with decentralized electricity production from renewable sources. It reduces the amount of electricity that needs to be transported to the industry hotspots.

After years of shifting all relevant computing into the cloud, Edge Computing is a paradigm change. There are many obstacles to overcome, but most are technical and will be solved fast as soon as people start seriously working on them. Talent and money seem to be in place for this to happen, but it will lose momentum fast when it turns out that a blurry, ambiguous legal framework puts the protagonists at risk. To make Edge Computing happen, the German government has to make sure, with clear, concise, reality-aware regulations, that misguided jurisdictional aberrations are kept in check, so they don’t keep end users from using Edge devices.

Empowering the individual

The perspective taken by the cornerstone paper is very much top-down. It talks very little about empowering people. Of course, it is crucial to attract an elite and create an ideal environment for them to work in. But we should not stop there. The field of AI is vast and widely unexplored, with plenty of room for surprises. People here tend to have a broad and solid education, even those without a Data Science PhD. This is a great resource that we should tap into if we want to get ahead. When Germany promotes people science, when German companies encourage their employees to use the available tools and implement AI solutions within their own realm of expertise, with their existing data, to solve their own local problems, then we will very soon have a broad adoption of AI technology made in Germany through all industries and areas of society.

If we don’t do that, most of these solutions will come years later (when one of the few experts finally has time), or organizations will fall back on standard products or cloud solutions that won’t give Germany any competitive advantage.

Grand Challenges

Grand challenges define the areas that we want to push forward with special rigor. They represent the most pressing problems we hope to solve with AI technology. We are looking for the best possible solution and offer high rewards for it. These problems are:

  1. Agile Intrusion Detection: Detecting hackers and dangerous software early is important for organizations as well as for individuals in a world where cyber warfare is increasingly a regular tool of robust diplomacy, and nation states carry out direct attacks against private entities. To be able to protect personal data and trade secrets efficiently, European businesses need intelligent shields that detect and stop complex attacks with close to 100% accuracy. It is a huge design flaw in the GDPR to allow EU member states to just dump the responsibility for this part of cyber defense on the first line of victims in the crossfire of coordinated attacks: those private entities that happen to work with personal data. Barber shops and soccer clubs are not in the business of cyber defense. Nation states are. The states should build and provide the tools that help protect people’s data from attacks by criminals and other nations. These are defensive weapons of modern warfare, and it is the responsibility of states to develop and deploy or provide them. Highly accurate intrusion detection and prevention is also an important puzzle piece for making Edge Computing a success, because people will rightfully refuse to invest in devices that put them at risk. Computers on the edge need to be able to defend themselves against unforeseen attacks in a nimble and adaptive way. This can only be solved with an advanced combination of AI techniques, and Germany needs the best possible solution, so the German government should make it a top priority to get the best talents in the field to work on this task.
  2. Reliable information and democratic consensus building: Fine-grained political campaigns and micro-targeting lead to ever more polarization and radicalization, even within homogeneous groups of people. When similar people who should have similar interests are systematically presented with different facts and are shielded from other facts, these people diverge from each other in a way that undermines social cohesion and the fabric of democracy. The traditional mass media have proven to be ill-equipped to curtail this development. To detect coordinated disinformation campaigns before being sucked into them, people need a tool that is much more personalized than mass media can be. We need easy-to-use instruments for each citizen to quickly check facts and put them into perspective at the moment they are presented to her. 85 years after the introduction of the Volksempfänger, it is time to introduce a device to deflate propaganda. This service needs to be free from commercial interests and political influence. It should operate as automatically as possible, but needs an independent controlling body to keep machine bias in check. And first of all, it needs to be created. That is what the second grand challenge is about.
  3. Next Generation Personal Agent: It is already becoming difficult to imagine a world without smart personal assistants like Amazon Alexa™, Google Assistant™ and Microsoft Cortana™. They are extremely useful in organizing daily private life and are getting better every day. They are also a textbook example of the principal-agent problem. When these agents are asked to perform an action that involves a conflict of interest between the owner and the service provider (i.e. Amazon, Google, Microsoft), they will tend to resolve the conflict in favor of the service provider. We need a device that offers services similar to those of smart assistants, but is able to learn to make decisions in favor of the owner. Since this device needs to adapt to the owner much better than existing smart assistants, the learning should be less centralized than it is in current solutions. Ideally the device should be able to use its own computing capabilities for the training process. This also allows sensitive private and personal data to be used for the training without sending it to external service providers. The goal should be an electronic personal agent that the owner trusts enough that she does not need to control its actions.


Building the Reinforcement Learning Framework

To build our reinforcement learning framework, we are going to follow the basic recipe laid out in the February 2015 Nature article “Human-level control through deep reinforcement learning”.

Reinforcement learning has been shown to reach human and superhuman performance in board games and video games. Transferring the methods and experiences from this to the use case of trading goods or securities seems promising, because it has many similar characteristics:

  • interaction with an environment that represents certain aspects of the real world,
  • a limited set of actions to interact with this environment,
  • a well-defined success measure (called “reward”),
  • past actions determine the future rewards,
  • a finite, semi-structured definition of the state of the environment,
  • infeasibility of directly determining the future outcome of an action due to a prohibitively large decision tree, incomplete information and missing knowledge about the interaction between the influencing factors.

Our inference engine is going to be Deeplearning4J (DL4J). The DL4J website contains a very brief and well-written introduction to reinforcement learning, which I highly recommend if you are not familiar with the concept yet.

The first step in implementing an RL framework for Bitcoin trading is to map the conceptual elements of the process to our use case:

  • Action
  • Actor / Agent
  • Environment
  • State
  • Reward


An action is a distinct operation with a direct impact on the state of the actor and the environment. In the case of game playing, placing a tile on a specific field on the board or moving a joystick in a certain direction are examples of actions. In the case of Bitcoin trading, the obvious actions are placing and cancelling orders to buy or sell certain amounts of Bitcoin at a given cryptocurrency exchange.

A smaller set of actions improves the learning speed. For optimal performance we will restrict our action set to only three possible actions for now:

  1. Cancel all open sell orders and place a buy order at the last market price using 10% of the available USD assets.
  2. Cancel all open buy orders and place a sell order at the last market price using 10% of the available Bitcoin assets.
  3. Hold (do nothing).
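
The three actions above can be sketched as a small mapping from an action to the order it implies. This is an illustration in plain Python, not the Java code of the trading system; the names `Action`, `Account`, `Order` and `order_for` are hypothetical, and the sketch only describes the order instead of talking to an exchange.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    BUY = 0
    SELL = 1
    HOLD = 2


@dataclass
class Account:
    usd: float
    btc: float


@dataclass
class Order:
    side: str          # "buy" or "sell"
    btc_amount: float  # order size in BTC
    limit_price: float # last market price
    cancel_side: str   # open orders on this side are cancelled first


FRACTION = 0.10  # share of the available assets used per order


def order_for(action: Action, account: Account, last_price: float) -> Optional[Order]:
    """Translate an action into the order it implies, or None for HOLD."""
    if action is Action.BUY:
        # Spend 10% of the available USD at the last market price.
        return Order("buy", FRACTION * account.usd / last_price, last_price, "sell")
    if action is Action.SELL:
        # Sell 10% of the available BTC at the last market price.
        return Order("sell", FRACTION * account.btc, last_price, "buy")
    return None
```

For example, with 1000 USD available and a last price of 500 USD, `order_for(Action.BUY, Account(1000.0, 2.0), 500.0)` describes a buy order for about 0.2 BTC.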

In a later version we will likely extend this to

  • have cancelling and placing orders as distinct actions,
  • a larger variety of amounts (other than “10% of available assets”) to use for buy and sell orders,
  • different limits, above and below the last market price.

But for gaining experience, and to go easy on our computational resources, we are going to keep it simple for now.


The actor is our trading bot, which is using the Bitstamp-API to place and cancel orders. We are going to reuse existing Java code from the old trading system for this.


Since we don’t want to reinvent the data collection and we have already collected several years’ worth of training data, the environment is given by all the data sources that we have defined in the old trading system.


The current state of the environment is the input for the inference machine. We can reuse the format that we have used for the old Bitcoin prediction system for this. It has some issues that we might address later, but we don’t want to reinvent the wheel, so we stick to it for now.


We have two possible ways to define the reward:

  • After each executed sell order: the difference between the sell price and the average previous buy price of the sold Bitcoins, minus transaction costs
  • In each step: the difference between the current net value (USD + BTC) and the net value in the previous step.

The first option compares better to the game analogy and also takes advantage of one of the key features of reinforcement learning (assessing future outcomes of current actions), but the second option promises faster convergence, so to begin, we choose the second option.
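
To make the two reward definitions concrete, here is a minimal sketch of both. The function names and parameters are hypothetical, and the fee model (a proportional fee charged on the sell side only) is an assumption, since the text only says “minus transaction costs”.

```python
def step_reward(usd_prev: float, btc_prev: float, price_prev: float,
                usd_now: float, btc_now: float, price_now: float) -> float:
    """Second option: change in net value (USD + BTC valued in USD) per step."""
    net_prev = usd_prev + btc_prev * price_prev
    net_now = usd_now + btc_now * price_now
    return net_now - net_prev


def sell_reward(sell_price: float, avg_buy_price: float,
                btc_sold: float, fee_rate: float) -> float:
    """First option: realized profit of a sell order, minus transaction costs.

    Assumption: the fee is a fraction of the sell volume in USD.
    """
    gross = (sell_price - avg_buy_price) * btc_sold
    fees = fee_rate * sell_price * btc_sold
    return gross - fees
```

Note that the per-step reward is non-zero even when the agent only holds, because the BTC position is revalued at every step. That is what makes the signal dense, and it is a plausible reason for the faster convergence.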


Transition to Reinforcement Learning

Our current goal is to introduce Reinforcement Learning into the decision making component of an existing Bitcoin trading system. To put this into a broader context: What we are about to do is, in the flowery terms of business speak, the upgrade from “Predictive Analytics” to “Prescriptive Analytics”. Please google for “Analytic Value Escalator” for a 30000 ft. view, and think a minute about this: Is the question “What will happen” really harder to answer than “Why did it happen”? (In data science you have to take market research seriously, even if it hurts.)

Walking up a step on this escalator, with the task at hand, we are interested in going from the question “What will happen?” to “How can we make it happen?”. Again, I am not completely convinced that the latter is more difficult to answer, but it requires us to make a big architectural change in the trading system.

To understand this, let’s have a look at the training pipeline that we have used so far.

Old Architecture: Separation of Concerns


We had three servers involved:

  • A build server running Jenkins, which is used for multiple projects and not of particular interest for us here and now.
  • A server running the trading software, called “Trade Hardware”, executing a stack of shell scripts and Java programs.
  • A powerful machine for computationally intense tasks implemented in Matlab and Java, called “Compute Hardware”.

Here is what happens at the blue numbered circles:

  1. Once a week the build server triggered a training session. The reason for regular re-training is that we wanted current trends in market behavior to be reflected in our prediction model. Also, each week we had aggregated significantly more training data. More training data promised better prediction results.
  2. Input variables are collected and normalized for neural network training.
  3. Target variables are calculated for training. We used a binary classifier with 19 output variables that predicted different events like this: “The BTC/USD rate will go up by 2% in the next 20 minutes”.
  4. To reduce the size of the input, a PCA was performed and only the strongest factors were used as input variables. The PCA eigenvectors and the normalization factors from step 2 are stored for later, to transform raw input data in production to a format consistent with the training input.
  5. The previous neural network model is loaded to be used as initial model for training.
  6. The training is run in Matlab. We don’t need to dive deeper into this topic, because in the new architecture, we will use Deeplearning4J instead of Matlab for the training step.
  7. The new trained model is stored.
  8. The new model is tested in an extensive trade simulation.
  9. The trading software is restarted so it uses the updated model.
  10. Normal trading goes on for the rest of the week.
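
Steps 2 to 4 of the pipeline above can be illustrated with a small sketch. It is written in plain Python rather than the original Matlab and Java, it leaves out the PCA, and the exact label definition is an assumption: a sample counts as positive if the rate reaches the threshold at any point within the horizon. The names `fit_normalizer`, `normalize` and `up_move_labels` are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple


def fit_normalizer(column: Sequence[float]) -> Tuple[float, float]:
    """Compute the normalization factors that are stored after training."""
    sigma = pstdev(column) or 1.0  # avoid division by zero for constant input
    return mean(column), sigma


def normalize(column: Sequence[float], factors: Tuple[float, float]) -> List[float]:
    """Apply stored factors, so production input matches the training input."""
    mu, sigma = factors
    return [(x - mu) / sigma for x in column]


def up_move_labels(rates: Sequence[float], horizon: int, threshold: float) -> List[int]:
    """Binary target: 1 if the rate rises by `threshold` within `horizon` samples."""
    labels = []
    for i, rate in enumerate(rates):
        future = rates[i + 1:i + 1 + horizon]
        hit = bool(future) and max(future) >= rate * (1.0 + threshold)
        labels.append(1 if hit else 0)
    return labels
```

With one sample per minute, `up_move_labels(rates, horizon=20, threshold=0.02)` would correspond to the event quoted in step 3. The important design point is that the factors returned by `fit_normalizer` are computed once during training and reused unchanged in production.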

New architecture: Tight Integration

This pipeline was built around the concept of a strict separation between trading execution and prediction. The prediction algorithm was part of a decision making module, which itself was just a plugin module of the trading software and could be replaced by another implementation that encodes another strategy. This was actually used to assess simulation results: To determine a baseline performance, the decision making component in the simulation was replaced by one that follows a random trading strategy.
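
The plugin idea can be sketched as follows. This is not the interface of the actual trading software (which is written in Java); `DecisionMaker`, `RandomDecisionMaker` and `simulate` are hypothetical names that only illustrate how a random baseline can be swapped in for the real strategy.

```python
import random
from typing import List, Optional, Protocol, Sequence


class DecisionMaker(Protocol):
    """Anything that turns the current state into a trading decision."""

    def decide(self, state: Sequence[float]) -> str: ...


class RandomDecisionMaker:
    """Baseline strategy: ignores the state and picks a decision at random."""

    def __init__(self, seed: Optional[int] = None) -> None:
        self._rng = random.Random(seed)

    def decide(self, state: Sequence[float]) -> str:
        return self._rng.choice(("buy", "sell", "hold"))


def simulate(decision_maker: DecisionMaker,
             states: Sequence[Sequence[float]]) -> List[str]:
    """Run any decision maker over a recorded sequence of states."""
    return [decision_maker.decide(state) for state in states]
```

Because `simulate` only depends on the `decide` method, the same simulation can be run with the random baseline and with the model-based strategy, and the two results can be compared directly.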

With the transition to reinforcement learning, this strict separation goes away. The learning agent learns from the interaction with the environment, so it completely assumes the role of the decision making component. From a system design perspective, this makes our life much easier, because many hard questions in the decision making component are now covered by machine learning: How to interpret the predictions? Where to set thresholds for the output variables? What percentage of available assets to use in a trade? The reinforcement learning agent produces a finished decision that can be directly converted into a buy or sell order.

Also, the agent does not stop learning once it is in production. The learning is a permanent background process that takes place during trading. This means that after the initial training phase, we can retire the “Compute Hardware”, because there is no necessity for weekly retraining.

All this looks lean and efficient at first glance, but it will create problems down the road:

  • The tight integration between machine learning and business logic results in a monolithic architecture,  which will be hard to maintain.
  • The interface between data science and software development has grown considerably. In the old architecture, the responsibility of data science ended with the prediction, and software development could take over from there. Both groups worked strictly within the bounds of their traditional realms, each with their established tools and processes, which have very little in common, other than the fact that they mostly run on computers. The new architecture leads to a huge overlap of responsibilities, which will require new tools, a common language, mutual understanding and a lot of patience with each other.


Even without looking at the specifics of Reinforcement Learning, we already see that the system design will become much simpler. The machine learning subsystem assumes more responsibility, leaving fewer moving parts in the rest of the system.

The project management, on the other hand, might turn out to be challenging, because the different worlds of data science and software development need to work much more closely together.


Deep Reinforcement Learning for Bitcoin trading

It’s been more than a year since the last entry regarding automated Bitcoin trading was published here. The series was supposed to cover a project in which we used deep learning to predict Bitcoin exchange rates for fun and profit.

We developed the system in 2014 and operated it all through the year 2015. It performed very well during the first three quarters of 2015, … and terribly during the last quarter. At the end of the year we stopped it. Despite serious losses during the last three months, it can still be considered a solid overall success.

I never finished the series, but we recently deployed a new version, which includes some major changes that will hopefully turn out to be improvements:

  • We use Reinforcement Learning, following DeepMind’s basic recipe (Deep Q-learning with Experience Replay) from the iconic Atari article in Nature magazine. This eliminates the separation of prediction and trading as distinct processes. The inference component directly creates a buy/sell decision instead of just a prediction. Furthermore the new approach eliminates the separation of training and production (after an initial training phase). The neural network is trained continuously on the trading machine. No more downtime is needed for re-training once a week, and no separate compute hardware is lying idle with nothing to do for the other six days of the week.
  • We use Deeplearning4J (DL4J) instead of Matlab code for the training of the neural network. DL4J is a Java framework for defining, training and executing machine learning models. It integrates nicely with the trading code, which is written in Java.
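
The two core ingredients of the recipe mentioned above, experience replay and the Q-learning target, can be sketched in a few lines. This is a plain Python illustration, not the DL4J code of the trading system; the class and function names, the buffer capacity and the discount factor `gamma` are assumptions, and the neural network that would produce the Q-values is left out.

```python
import random
from collections import deque
from typing import Optional, Sequence


class ReplayBuffer:
    """Fixed-size memory of past transitions; the oldest entries are dropped."""

    def __init__(self, capacity: int, seed: Optional[int] = None) -> None:
        self._items = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done) -> None:
        self._items.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int) -> list:
        """Draw a random mini-batch, which breaks the correlation between steps."""
        count = min(batch_size, len(self._items))
        return self._rng.sample(list(self._items), count)

    def __len__(self) -> int:
        return len(self._items)


def q_target(reward: float, next_q_values: Sequence[float],
             done: bool, gamma: float = 0.99) -> float:
    """Q-learning target: reward plus discounted best Q-value of the next state."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

During continuous training, each trading step would add one transition to the buffer, and the network would be updated on a sampled mini-batch towards the values returned by `q_target`.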

This will change the course of this blog. Instead of finishing the report on what we did in 2014, I am now planning to write about the new system. It turns out that most of the code we have looked at so far is also in the new system, so we can just continue where we left off a year ago.