Why Neural Networks work

One popular explanation of why artificial neural networks can do what they do goes along these lines:

  1. A brain is capable of doing these things.
  2. An artificial neural network is a simulation of a brain.
  3. Therefore an artificial neural network can do these things, too.

Admittedly, most things in computer science that “work”, in the sense that they produce useful output for the real world, are implementations of theoretical models that have been built by other sciences, so this is kind of a valid explanation. I don’t like it anyway.

My first problem with it is this: it’s not quite true that an artificial neural network (ANN) is a simulation of a brain. To be fair, some come impressively close. But in the context of this blog we unambitiously restrict ourselves to the level of sophistication that we find in most real-world ANNs, which are radical (radical!) simplifications of even the simplest natural neural networks.

Second: it does not help most people to understand why an ANN is capable of doing useful work. Unless you already understand the brain, it won’t help you much when I tell you how we are going to map gray matter to mathematical concepts. (And if you already understand the brain: Welcome to the blog. You can skip the rest of this post, if you want.)

I want to come from the other side and approach the topic as an engineering problem. Buckle up. We are going to manufacture a special-purpose classification machine, and then (in a later post) we will generalize it and see if the result has any similarity to what the neurosciences know about the brain.

As a basic motivation, let’s assume that your boss has found a webcam that shows a stock market chart (like this), and came up with a brilliant idea: he will become insanely rich with a new software that reads the chart and outputs some kind of likelihood that the market is in an upward trend. Your boss calls this likelihood “Zuversicht”, and we are going to stick with this term for a while, because we like German words, and because the corresponding English word (“confidence”) already has a certain meaning in statistics, and we want to prevent confusion resulting from ambiguous terminology.

OK, now our input is an image from a webcam, so we have a two-dimensional array of pixel colors. To make things easier, you convert the image to grayscale, so you only have to think about the pixels’ brightness and can ignore hue and saturation. You look at some examples of upward trends and can’t help but observe that the lines tend to start in the lower left corner and zigzag their way to the upper right corner.

[Figure: 2×2 filter heatmap for a strict uptrend, with chart]

Breaking the image down into quadrants, you notice that in these cases the average pixel brightness in Q2 and Q3 is higher than in Q1 and Q4. With this insight you write the following lines of code and declare your job done.

double zuversichtUptrend(double[][] image) {
  int imgHeight = image.length;
  int imgWidth = image[0].length;
  // average brightness per quadrant; Q2 = upper right, Q3 = lower left
  double averageBrightnessQ2 = averageBrightness(image, 0, imgHeight / 2, imgWidth / 2, imgWidth);
  double averageBrightnessQ3 = averageBrightness(image, imgHeight / 2, imgHeight, 0, imgWidth / 2);
  return averageBrightnessQ2 + averageBrightnessQ3;
}

double averageBrightness(double[][] img, int rowFrom, int rowTo, int colFrom, int colTo) {
  double sum = 0;
  for (int r = rowFrom; r < rowTo; r++)
    for (int c = colFrom; c < colTo; c++)
      sum += img[r][c];
  return sum / ((rowTo - rowFrom) * (colTo - colFrom));
}

It works great on the test data, your boss is happy, and his boss gives him a raise. But a few weeks later, he tells you that he’s not happy anymore. He has not become insanely rich!

What went wrong?

Apparently, your program has misclassified the trend on several occasions. So you have a look at the chart images for these days and see two major flaws of your approach:

[Figure: webcam chart where the 2×2 filter is too coarse]

  1. On some days, the chart went almost flat or even turned negative. The line was just low enough in the early hours to run through Q3 and just high enough in the later hours to run mostly through Q2.
  2. On other days the chart went clearly down, but your software’s Zuversicht value was very high. The reason turns out to be that the overall brightness of the picture was high on those days, illuminating Q2 and Q3 without the chart line covering much space in them.

So you start the second iteration of your engineering endeavor.

To solve problem 1, you obviously need a higher resolution. Let’s try 8×8! This partition conveniently allows us to identify each field with chessboard notation.

[Figure: 8×8 filter heatmap for a strict uptrend]

A perfect upward trend, which your boss defines as a straight line from the lower left to the upper right, will light up the fields A1, B2, …, H8 while all other fields remain dark. The flat chart from problem 1 will rather light up the fields A4, B4, …, G5, H5. Great, but what about all the other possible charts that show a trend going upward in a non-steady, somewhat chaotic fashion? This is, after all, rather the norm than the exception.

[Figure: strict 8×8 filter over a webcam chart]

Let’s add some fuzziness to the system. The intuition is this: for each field, you guess the probability that the full chart shows an overall upward trend, given that this particular field is lit. For example, if the lower right corner (H1) is lit, the probability of an overall positive trend is zero. If the field left of it (G1) is lit, the probability is close to zero, but there is still a possibility that the chart makes a radical upward turn in the remaining 1/8 of the chart. The closer you get to the perfect upward trend, the higher the probability becomes.
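One way to sketch this guessing step in code (the fall-off rule and all names here are my own illustrative assumptions, not the exact numbers behind the heatmap): give each field a weight that decreases with its distance from the ideal diagonal A1, B2, …, H8.

```java
/** Sketch: fill an 8x8 weight matrix with a simple distance-to-diagonal heuristic. */
class UptrendWeights {
    static double[][] heuristicWeights() {
        double[][] w = new double[8][8];
        for (int row = 0; row < 8; row++) {
            for (int col = 0; col < 8; col++) {
                // the ideal uptrend lights the diagonal, i.e. fields with row == col
                int distance = Math.abs(row - col);
                // weight 1.0 on the diagonal, fading to 0.0 within four fields
                w[row][col] = Math.max(0.0, 1.0 - distance / 4.0);
            }
        }
        return w;
    }
}
```

Any monotonically falling function of the distance would do; the linear fall-off is just the simplest choice.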

[Figure: 8×8 probability heatmap for an uptrend]

You call the resulting 8×8 numbers a “weight matrix”. You can use it as a filter for the actual chart images as follows:

  1. For each field of the loaded picture you multiply the actual average brightness with the corresponding value in the weight matrix. The product is high when both the average brightness and the weight are high; otherwise it is low. You repeat this step for each field, 64 times altogether.
  2. You add up all the products.

The closer the actual chart zigzags around the ideal chart, the higher the sum will be. But even when the actual chart goes astray: if it remains on a positive trajectory, we will get a relatively high result in this calculation.
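The two steps above boil down to an elementwise multiply-and-sum. A minimal sketch (class and method names are illustrative):

```java
/** Sketch: score a chart image against a weight matrix (steps 1 and 2 above). */
class MatrixFilter {
    static double score(double[][] brightness, double[][] weights) {
        double sum = 0.0;
        for (int row = 0; row < brightness.length; row++) {
            for (int col = 0; col < brightness[row].length; col++) {
                // step 1: multiply each field's brightness with its weight;
                // step 2: accumulate all the products
                sum += brightness[row][col] * weights[row][col];
            }
        }
        return sum;
    }
}
```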

So far so good. Let’s look at the second problem: the tide lifts all boats, and the ceiling light lights up all pixels. When someone turns on the light in the trading room, all pixels in the webcam picture become brighter. Even areas that the chart line does not pass through appear brighter, which renders our filtering result worthless.

Let’s add a preprocessing step to fix this. If there were no line chart in the picture, all pixels would have approximately the same brightness, and they would ideally be black (brightness zero). If, in this case, you subtracted the overall average brightness from each field’s measured brightness, the result would be all-black fields. Subtracting the overall average brightness normalizes the picture to what our further processing needs.

Now take the line chart into consideration. Because it covers only a very small fraction of the image, it does not change the overall average brightness much. The light noise that illuminated the dark parts of the image also made the bright parts (the lines of the chart) brighter. So if we subtract the overall average brightness from the bright pixels, we also normalize those parts of the image to what the next processing step expects as input.

Great, now you know what to do to solve problem 2. The question is: how do you do it? Wouldn’t it be great if you could implement both processing steps in a unified way? In other words: is it possible to define a weight matrix in such a way that applying it to the input data executes the average-brightness subtraction of your preprocessing step? It turns out: it is possible.

Imagine the following weight matrix for field A1:

  • Value at position A1: 1 - 1/64
  • Value at all other positions: -1/64

Please convince yourself that this matrix performs the average subtraction for position A1. Of course, this works just as well for all other positions.
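If you prefer to convince yourself programmatically, here is a small sketch (names are mine) that builds this weight matrix for an arbitrary field and applies it with the same multiply-and-sum rule as before. For a uniformly lit image the result is zero for every field, i.e. the average has indeed been subtracted:

```java
/** Sketch: per-field normalization weights (1 - 1/64 at the field, -1/64 elsewhere). */
class NormalizationMatrix {
    static final int N = 8;

    /** Build the normalization weight matrix for field (row, col). */
    static double[][] weightsFor(int row, int col) {
        double[][] w = new double[N][N];
        for (int r = 0; r < N; r++)
            for (int c = 0; c < N; c++)
                w[r][c] = (r == row && c == col) ? 1.0 - 1.0 / (N * N) : -1.0 / (N * N);
        return w;
    }

    /** Applying the matrix = elementwise multiply and sum, as before. */
    static double apply(double[][] image, double[][] w) {
        double sum = 0.0;
        for (int r = 0; r < N; r++)
            for (int c = 0; c < N; c++)
                sum += image[r][c] * w[r][c];
        return sum;
    }
}
```

The result equals the field’s own brightness minus the average brightness of all 64 fields.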

[Figure: 8×8 pixel-normalization filter over a webcam chart]

Hmmm, interesting: you just solved two seemingly totally different problems with the same approach. It feels a little odd to define a huge matrix for a calculation that could easily be done procedurally, but you have a feeling that there might be a systematic advantage in a unified way of tackling problems in this project. Also, of course, you know that vector (and with them matrix) calculations are the strong point of GPU data processing as well as of highly optimized software packages like Matlab (“Matrix Lab”!) and Octave. You sense that after your initial success, your boss might become greedy, which will ultimately put more load on your software. Having some strong performance afterburners like these in your arsenal might come in handy later.

Your overall process has three steps now:

  1. You create an 8×8 matrix from the image data as the input data layer. (To facilitate vector operations, you “flatten” this matrix to a vector of length 64, but that’s an implementation detail.)
  2. For each field you apply the corresponding preprocessing 8×8 weight matrix to the whole 8×8 input layer matrix. The result is a new 8×8 matrix, which you call the “hidden layer”. (And in the real world, you would do this again with “flattened” vectors and a large 64×64 weight matrix representing all fields. This is mathematically equivalent and can be well parallelized. Again: just an implementation detail.)
  3. You apply the classification weight matrix to the hidden layer and get the Zuversicht value as output.
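Taken together, the three steps form a two-layer linear network. A compact sketch under the “flattened” interpretation (matrix contents and all names here are placeholders, not the actual model):

```java
/** Sketch of the full forward pass: flatten -> hidden layer -> Zuversicht. */
class FeedforwardSketch {
    /** Step 1: flatten the 8x8 image matrix to a vector of length 64. */
    static double[] flatten(double[][] image) {
        double[] v = new double[64];
        for (int r = 0; r < 8; r++)
            for (int c = 0; c < 8; c++)
                v[r * 8 + c] = image[r][c];
        return v;
    }

    /** Steps 2 and 3: each layer is just a matrix-vector product. */
    static double[] multiply(double[][] weights, double[] in) {
        double[] out = new double[weights.length];
        for (int i = 0; i < weights.length; i++)
            for (int j = 0; j < in.length; j++)
                out[i] += weights[i][j] * in[j];
        return out;
    }

    static double zuversicht(double[][] image, double[][] preprocess, double[][] classify) {
        double[] hidden = multiply(preprocess, flatten(image)); // 64x64 matrix -> hidden layer
        return multiply(classify, hidden)[0];                   // 1x64 matrix -> scalar output
    }
}
```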

 

There you go: without thinking much about neurons, synapses, and ganglia, you have handcrafted your first artificial neural network. Your new software is actually what people call a feedforward neural network with a linear activation function. When you define a threshold value for the Zuversicht output, you also have a binary linear classifier.

Your neural network is still far from perfect. You will eventually get there, but not today. Let’s just mention a few things that you would need to think about before going into production:

  • It is not able to learn! It works because you were able to provide a “model” (that is, the weights in the weight matrices). This is good enough for now, but in the future we prefer to let the computer do the work of figuring out the model data.
  • It is not well protected against eccentric input data. Imagine what happens if a camera error or a data transfer problem produces, for a single pixel, a value of 325212498434 instead of a value in the expected range between 0 and 1.
  • It will still fail to make your boss immeasurably rich, because it does not predict anything. It only classifies a chart as close enough to your boss’s definition of a perfect chart. This is what he wanted, so it is partly his fault. But we can nevertheless do better.
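For the second point, one cheap safeguard (my suggestion, not part of the original design) is to clamp every incoming pixel into the expected range before it enters the network:

```java
/** Sketch: clamp incoming pixel values into the expected [0, 1] range. */
class InputGuard {
    static double clamp(double pixel) {
        if (Double.isNaN(pixel)) return 0.0;      // treat transmission garbage as black
        return Math.max(0.0, Math.min(1.0, pixel)); // cut off everything outside [0, 1]
    }
}
```

This does not make the classifier smarter, but it keeps a single broken pixel from dominating the weighted sum.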

Even with these shortcomings, you have hopefully built up some understanding of how a neural network is able to recognize a pattern. We have seen that, even without actively imitating nature, we arrive at a similar result when we simply work our way to the best solution in a straightforward manner.

A little heads-up: in the next post, we will build the software to convert the collected Bitcoin price and market data into a format suited as an input data layer for a neural network like this. If your data collector from the previous post is not running yet, please start it soon, so you have some data to play with next time.

 

 


Wrapping up data collection

[Figure: screenshot of the running data collector]

To finally start with the data collection, you need some framework for your quote readers. So this is what we are going to build next.

We use three external libraries to facilitate this task:

  • Google Guava v19.0 (Apache License 2.0)
  • Cron4J 2.2 by Sauron Software (LGPL license)
  • JFreeChart 1.0.19 by Object Refinery (LGPL license).

Cron4J provides a lightweight scheduler, which we use to repeatedly execute the following process in a controlled manner:

  1. Load data from online resources using a bunch of QuoteReaders.
  2. Write the data to a new line in an output file.
  3. Visualize the data in a line chart.

From the Guava library we use the EventBus to decouple the main process from the data output, to keep technical problems contained. Please note that the EventBus is marked @Beta in Guava.

JFreeChart gives us an easy way to output the data as a chart, which is sometimes helpful.

Main

Let’s examine the main method.

/**
 * Main method
 *
 * @param args Optional: 1. chart|nochart, 2. output filename.
 */
public static void main(String[] args) {

  // add a shutdown hook for finishing the output file when the process is
  // stopped
  addShutdownHook();

  // quote reader for Bitstamp data
  final QuoteReader rdr = new BitstampQuoteReader();

  // ETF tracking the CSI 300
  final QuoteReader yqrSha = new YahooFinanceQuoteReader("ASHR");
  final QuoteReader yqrCac40 = new YahooFinanceQuoteReader("%5EFCHI");
  final QuoteReader yqrChf = new YahooFinanceQuoteReader("CHFUSD%3DX");
  final QuoteReader yqrZar = new YahooFinanceQuoteReader("ZARUSD%3DX");
  final QuoteReader rqrEur = new YahooFinanceQuoteReader("EURUSD%3DX");
  final QuoteReader yqrRub = new YahooFinanceQuoteReader("RUBUSD%3DX");
  final QuoteReader yqrCny = new YahooFinanceQuoteReader("CNY%3DX");
  final QuoteReader yqrGoldFutures = new YahooFinanceQuoteReader("GCJ16.CMX");
  // SPDR Dow Jones Industrial Average ETF (DIA) reflects the Dow Jones index.
  // We need to use this because Yahoo no longer provides the original index
  // as CSV.
  final QuoteReader yqrDowJones = new YahooFinanceQuoteReader("DIA");

  // create event bus
  final EventBus eventBus = new EventBus("hsecDataCollectorEventBus");

  // register file writer
  eventBus.register(new OutputFileWriter(getOutputFileName(args)));

  // if parameter "chart" is set, register the ChartWriter
  if (checkChartArgument(args))
    eventBus.register(new ChartWriter());

  // -------------------
  // here would be the place to register more event handlers
  // like a database writer, RSS publisher ...
  // -------------------

  final IntHolder failedConsecutiveLoops = new IntHolder();

  // a random offset time for scheduled events to desynchronize running
  // instances: prevents all instances of this program from hitting the APIs
  // at the first second of each minute.
  final long offsetTime = Math.round(Math.random() * 59000d);

  Runnable dataCollectorSchedulerTask = new Runnable() {
    public void run() {
      // let's look at this later ...
    }
  };

  // scheduler for data collection
  Scheduler dataCollectorScheduler = new Scheduler();
  dataCollectorScheduler.schedule(SCHEDULE_EVERY_MINUTE,
      dataCollectorSchedulerTask);
  dataCollectorScheduler.start();

  // keep running as long as everything is OK.
  while (keepRunning && failedConsecutiveLoops.i < 100) {
    try {
      Thread.sleep(60000);
    } catch (InterruptedException e) {
      // ignore
    }
  }
  dataCollectorScheduler.stop();
}

After registering the shutdown hook, we initialize the following quote readers:

  • BTC – Bitcoin in USD, quote from Bitstamp.
  • ASHR – an ETF tracking the CSI 300, giving us information about the health of the Chinese markets.
  • ^FCHI – CAC 40, quote from Yahoo.
  • CHFUSD.X – CHF in USD, quote from Yahoo.
  • ZARUSD.X – ZAR in USD, quote from Yahoo.
  • EURUSD.X – EUR in USD, quote from Yahoo.
  • RUBUSD.X – RUB in USD, quote from Yahoo.
  • CNY.X – CNY in USD, quote from Yahoo.
  • GCJ16.CMX – gold futures, quote from Yahoo.
  • DIA – an ETF tracking the Dow Jones Industrial Average.

Some indices are not provided by Yahoo when using the API, but for our purpose, an ETF tracking the index works just as well.

Next, we initialize the EventBus and register the data sinks.

Then we prepare and start the Scheduler to execute the main loop once a minute.

Next we take a closer look at the Runnable:

public void run() {
  if (failedConsecutiveLoops.i > 5) {
    // something's stinky. Skip some runs to give the remote systems time to
    // recover.
    if (failedConsecutiveLoops.i % 7 != 0) {
      failedConsecutiveLoops.i++;
      return;
    }
  }
  // sleep offset time
  try {
    Thread.sleep(offsetTime);
  } catch (InterruptedException e1) {
    // ignore
  }

  // read and publish quotes
  QuotesChangedEvent e = new QuotesChangedEvent();
  e.currentBtcQuote = rdr.getCurrentQuote();
  e.quoteSha = yqrSha.getCurrentQuote();
  e.quoteCac40 = yqrCac40.getCurrentQuote();
  e.quoteChf = yqrChf.getCurrentQuote();
  e.quoteZar = yqrZar.getCurrentQuote();
  e.quoteEur = rqrEur.getCurrentQuote();
  e.quoteRub = yqrRub.getCurrentQuote();
  e.quoteGoldFuture = yqrGoldFutures.getCurrentQuote();
  e.quoteCny = yqrCny.getCurrentQuote();
  e.quoteDJI = yqrDowJones.getCurrentQuote();
  e.bidBtc = rdr.getBid();
  e.askBtc = rdr.getAsk();
  e.min24Btc = rdr.getMin24();
  e.max24Btc = rdr.getMax24();
  e.volume24Btc = rdr.getVolume24();
  e.vwapBtc = rdr.getVwap();

  // post event
  eventBus.post(e);

  // reset failed loop counter
  failedConsecutiveLoops.i = 0;
}

The counter “failedConsecutiveLoops” contains the number of runs that have failed in a row so far. Normally it should be 0. From time to time it will be a small number. When it reaches a big number, we assume that something is broken and will not recover in the foreseeable future. In this case we stop the scheduler and end the program.

QuotesChangedEvent

This class is just a plain data container. For the sake of readability we don’t use getters and setters.

/**
 * Event to be fired when new quotes have been loaded.
 */
static final class QuotesChangedEvent {
  double currentBtcQuote;
  double quoteSha;
  double quoteCac40;
  double quoteChf;
  double quoteZar;
  double quoteEur;
  double quoteRub;
  double quoteGoldFuture;
  double quoteCny;
  double quoteDJI;
  double bidBtc;
  double askBtc;
  double min24Btc;
  double max24Btc;
  double volume24Btc;
  double vwapBtc;
}

Data Sinks

The data sinks are ChartWriter and OutputFileWriter. There is not much to explain. You could use them as templates for other data sinks; the one that probably makes the most sense is a relational database writer.

/**
 * Writes a set of quotes to a line chart.
 */
static final class ChartWriter {
  /**
   * dataset object for the chart
   */
  DefaultCategoryDataset dataset = new DefaultCategoryDataset();

  /**
   * initialize the chart.
   *
   * @return JFreeChart object to feed.
   */
  private JFreeChart createLineChart() {
    JFreeChart lineChart = ChartFactory.createLineChart(
        "data collection", "", "Range", dataset);
    lineChart.setAntiAlias(true);
    return lineChart;
  }

  public ChartWriter() {
    try {
      JFrame frame = new JFrame();
      ChartPanel cp = new ChartPanel(createLineChart());
      frame.getContentPane().add(cp);
      frame.setSize(1200, 700);
      frame.setVisible(true);
    } catch (Exception e) {
      log.severe(e.getLocalizedMessage());
    }
  }

  @Subscribe
  public void recordQuoteChange(QuotesChangedEvent e) {
    String lTimeFormatted = DateFormat.getTimeInstance().format(new Date());

    // some normalization for the chart, so all values are in the same
    // ballpark
    dataset.addValue(e.currentBtcQuote / 100, "BTC", lTimeFormatted);
    dataset.addValue(e.quoteSha / 30, "SHA", lTimeFormatted);
    dataset.addValue(e.quoteCac40 / 5000, "Cac40", lTimeFormatted);
    dataset.addValue(e.quoteChf, "CHF", lTimeFormatted);
    dataset.addValue(e.quoteZar * 10, "ZAR", lTimeFormatted);
    dataset.addValue(e.quoteEur, "EUR", lTimeFormatted);
    dataset.addValue(e.quoteRub * 10, "RUB", lTimeFormatted);
    dataset.addValue(e.quoteGoldFuture / 20000, "Gold Fut", lTimeFormatted);
    dataset.addValue(e.quoteCny / 10, "CNY", lTimeFormatted);
    dataset.addValue(e.quoteDJI / 1000, "DJI", lTimeFormatted);

    // keep the chart from growing without bound
    if (dataset.getColumnCount() > 1000) {
      dataset.removeColumn(0);
    }
  }
}

/**
 * Writes a set of quotes to a new line in the output file.
 */
static final class OutputFileWriter {
  /**
   * writer for collected data
   */
  private FileWriter w;

  /**
   * Constructor
   *
   * @param outputFileName not null
   */
  public OutputFileWriter(String outputFileName) {
    try {
      w = new FileWriter(outputFileName);
      w.append("currentQuoteBtc"
          + "\tquoteSha\tquoteCac40\tquoteChf\tquoteZar\tquoteEur\tquoteRub\tquoteGold"
          + "\tquoteCny\tquoteDji"
          + "\tbid\task\tmin24\tmax24\tvolume24\tvwap");
      w.append("\n");
      w.flush();
    } catch (Exception e) {
      log.severe(e.getLocalizedMessage());
    }
  }

  @Subscribe
  public void recordQuoteChange(QuotesChangedEvent e) {
    try {
      w.append(String.format("%.3f\t%.3f\t%.3f\t%.3f\t%.3f"
          + "\t%.3f\t%.3f\t%.3f\t%.3f\t%.3f", e.currentBtcQuote,
          e.quoteSha, e.quoteCac40, e.quoteChf, e.quoteZar,
          e.quoteEur, e.quoteRub, e.quoteGoldFuture, e.quoteCny,
          e.quoteDJI));
      w.append(String.format("\t%.3f\t%.3f\t%.3f\t%.3f\t%.3f\t%.3f",
          e.bidBtc, e.askBtc, e.min24Btc, e.max24Btc,
          e.volume24Btc, e.vwapBtc));
      w.append("\n");
      w.flush();
    } catch (IOException ioe) {
      log.warning("failed to write out data");
    }
  }
}

A few auxiliary methods:


/**
 * check if the chart parameter has been set
 */
private static boolean checkChartArgument(String[] args) {
  return args != null && args.length > 0
      && "chart".equalsIgnoreCase(args[0]);
}

/**
 * read the output filename from the arguments
 */
private static String getOutputFileName(String[] args) {
  return (args != null && args.length > 1) ? args[1] : "output.csv";
}

/**
 * clean up when the VM is about to be terminated
 */
private static void addShutdownHook() {
  final Thread mainThread = Thread.currentThread();
  Runtime.getRuntime().addShutdownHook(new Thread() {
    public void run() {
      log.warning("VM is being shut down now.");
      // stop the schedulers
      keepRunning = false;
      try {
        mainThread.join();
      } catch (InterruptedException e) {
        log.log(Level.SEVERE, "", e);
      }
    }
  });
}

And for the sake of completeness: here are the statics:

/**
 * Data collector schedule: once per minute (cron string)
 */
private static final String SCHEDULE_EVERY_MINUTE = "* * * * *";

/**
 * Logger
 */
static Logger log = Logger.getLogger(DataCollector.class.getName());

/**
 * Data collection proceeds while the value is true.
 */
static volatile boolean keepRunning = true;

To run the program from the console, type


java de.hsec.datascience.btctrader.DataCollector chart myCollectedData.csv

If you (like me) run the data collection on a remote machine without a GUI, you want to use “nochart” instead of “chart” as the first parameter.

Congratulations

You have reached the first important milestone. There is still a lot of work ahead:

  • Building the Prediction Neural Network
  • Training the Neural Network
  • Testing the Neural Network
  • Creating a Bitstamp account and depositing funds
  • Building a trading bot that uses the predictions of your Neural Network for automated transactions on the Bitstamp exchange.

That is quite some ground to cover, and it will take us a while to get there. But while you take the next steps, you can already collect the data that you will need later for training and testing.

Collecting more raw data with the Yahoo Quote Reader

In the previous post we built a quote reader for data from Bitstamp. It provides useful market information regarding Bitcoin trading on the Bitstamp platform. That is a good start, but certainly not enough for prognostic purposes. How shall we proceed?

The Bitcoin market is heavily affected by world events and economic factors. The steep November price hike was arguably induced by Chinese gambling trends that emerged during the decline of the domestic stock and housing markets. The all-time high of the Bitcoin price was a reaction to the imminent expropriation of Cypriot bank customers.

We conclude that knowledge about the world beyond the Bitcoin markets could be useful for price prediction. There is one particular class of information that is easy to acquire and at the same time reflects relevant political shocks and the macroeconomic situation very well: stock quotes.

YahooFinanceQuoteReader

You have a great range of options for feeding stock ticker data to your software. One of the most prominent is Yahoo Finance. The following Java class reads quotes for arbitrary ticker symbols from a Yahoo service that returns them as comma-separated values (CSV).

 


package de.hsec.datascience.btctrader;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.logging.Logger;

/**
 * Reads quotes from Yahoo Finance.
 *
 * @author helmut hauschild
 */
public class YahooFinanceQuoteReader implements QuoteReader {

  /**
   * Yahoo Finance API URL for the ticker symbol
   */
  private URL yahooFinanceApiUrl;

  /**
   * the quote last read.
   */
  private double currentQuote;

  static Logger log = Logger.getLogger(YahooFinanceQuoteReader.class.getName());

  /**
   * Constructor
   *
   * @param pTickerSymbol
   *          ticker symbol to read
   */
  public YahooFinanceQuoteReader(String pTickerSymbol) {
    super();
    String lUrl = String.format(
        "http://finance.yahoo.com/d/quotes.csv?s=%s&f=nl1px", pTickerSymbol);
    try {
      yahooFinanceApiUrl = new URL(lUrl);
    } catch (MalformedURLException e) {
      throw new RuntimeException("Cannot initialize YahooQuoteReader class.", e);
    }
  }

  /**
   * Read and return a new ticker quote from Yahoo Finance.
   */
  public double getCurrentQuote() {
    readNextAndUpdate();
    return currentQuote;
  }

  // other getters

  public double getBid() {
    // We don't have this information. Return 0.
    return 0;
  }

  // ...

  /**
   * Read current data from Yahoo Finance and update the fields.
   */
  private void readNextAndUpdate() {
    BufferedReader in = null;
    StringBuffer sb = new StringBuffer();
    try {
      in = new BufferedReader(new InputStreamReader(
          yahooFinanceApiUrl.openStream()));

      String inputLine;
      while ((inputLine = in.readLine()) != null)
        sb.append(inputLine);
      currentQuote = Double.parseDouble(sb.toString().split(",")[1]);
    } catch (Exception e) {
      // catch-all because we don't have a surrounding framework (other than
      // the JRE) to handle unexpected exceptions.
      log.severe(String.format("Error reading data from yahoo api: %s",
          e.getLocalizedMessage()));
      return;
    } finally {
      try {
        if (in != null)
          in.close();
      } catch (IOException e) {
        log.severe("Unable to close input stream from yahoo api");
      }
    }
  }

  /**
   * main method for simple tests
   */
  public final static void main(String[] args) {
    YahooFinanceQuoteReader rqr = new YahooFinanceQuoteReader("A1YKTG.F");
    log.info("Next Quote: " + rqr.getCurrentQuote());
  }
}

A few remarks on this:

  • The BitstampQuoteReader does not need to be initialized with a ticker symbol, because there is only one price to read. The YahooFinanceQuoteReader is a bit different in this respect: for each ticker symbol you are interested in, you create a new dedicated instance.
  • The CSV service returns only the current value and the opening price. This does not match the QuoteReader interface, but you want to implement it anyway, because you want to handle the input data in as unified a way as possible. As a consequence, you have to provide empty implementations for methods like getBid, getAsk, etc.

The main method is provided as an easy way to test the implementation. If you prefer a unit test, it should be fairly easy to move the code there.
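For reference, judging from the call sites in the data collector's main loop, the QuoteReader interface presumably looks roughly like this (a reconstruction from usage, not the author's actual source file):

```java
/** Sketch of the QuoteReader interface, reconstructed from its call sites. */
interface QuoteReader {
    double getCurrentQuote(); // reads fresh data and returns the last price
    double getBid();
    double getAsk();
    double getMin24();
    double getMax24();
    double getVolume24();
    double getVwap();
}
```

Readers that cannot supply a value (like the Yahoo reader's getBid) simply return 0, as described above.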

GoogleFinanceQuoteReader

The implementation for the GoogleFinanceQuoteReader is very similar to the Yahoo version. The URL can be created like this:


new URL(String.format("http://www.google.com/finance/info?q=%s",tickerSymbol));

The response comes as JSON. I will skip the rest of the implementation here to prevent redundancy. Also, it is not clear how much longer the service will be available: the Google Finance API has been deprecated since 2011.

 

 

Start collecting data: the BitstampQuoteReader

The QuoteReader implementations we need for a start are luckily quite simple. They really do just one thing: load information from a URL and return it on request.

The most important data source for our project will be the Bitstamp API, so we will start with this.

It contains public functions that can be used without authentication and without an API key. They do have a throughput limitation in place, though: if you send more than 600 requests in 10 minutes, your IP address will be banned.

For now, we just need one API function: ticker. The URL is

https://www.bitstamp.net/api/ticker/

When you open it in a web browser, you see that the default response format is JSON. I have used org.json-20120521.jar for JSON parsing. Since the format in the ticker is fairly simple, any JSON Java library will probably do the job.
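For orientation, an abbreviated response looks roughly like this. The numeric values are made-up placeholders, but the field names are the ones our parser reads:

```json
{
  "last": "430.89",
  "high": "437.90",
  "low": "422.87",
  "volume": "8209.33",
  "bid": "430.50",
  "ask": "430.89",
  "vwap": "428.70",
  "timestamp": "1456130000"
}
```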

We end up with a function that looks somewhat like this:

 

/**
 * Read current data from the Bitstamp API and update the fields.
 */
private void readNextAndUpdate() {
  BufferedReader in = null;
  StringBuffer sb = new StringBuffer();
  try {
    in = new BufferedReader(new InputStreamReader(
        BITSTAMP_API_URL.openStream()));

    String inputLine;
    while ((inputLine = in.readLine()) != null)
      sb.append(inputLine);
  } catch (IOException e) {
    log.severe(String.format("Error reading data from bitstamp api: %s",
        e.getLocalizedMessage()));
    // IO exceptions will happen from time to time. If there is no systematic
    // problem, the best way for us to deal with them is to log and ignore
    // them. As a consequence, the next time interval will be executed with
    // outdated data. But this data is only a minute old, so it will not
    // result in completely insane predictions.
    return;
  } finally {
    try {
      if (in != null)
        in.close();
    } catch (IOException e) {
      log.severe("Unable to close input stream from bitstamp api");
    }
  }
  try {
    JSONObject jo = new JSONObject(sb.toString());
    currentQuote = jo.getDouble("last");
    high24 = jo.getDouble("high");
    low24 = jo.getDouble("low");
    volume = jo.getDouble("volume");
    bid = jo.getDouble("bid");
    ask = jo.getDouble("ask");
    vwap = jo.getDouble("vwap");
    log.info("quote from remote service: " + sb + "\nquote time: "
        + new Date(1000 * jo.getLong("timestamp")));
  } catch (Exception e) {
    // catch-all because we don't have a surrounding framework (other than
    // the JRE) to handle unexpected exceptions.
    log.severe(e.getLocalizedMessage());
  }
}

Note: the last trade price is not necessarily a good representation of the current value of a Bitcoin, because it can easily be manipulated during low-volume times. For the prediction, this is OK: we have a neural network, which will either downgrade the importance of this value if it turns out not to contribute to future prices, or even extract additional predictive power from recognizing the manipulation. Either way is fine for us.

A problem might arise from it during trading, though. When you want to sell a Bitcoin at the exchange, and the last price is lower than the fair price would be, you might be tempted to offer your coin at a lower price than necessary.

When I noticed this, my system had been running stable for quite a while, so I refrained from changing it. But when you start from scratch, you might want to keep this in mind.

For the sake of completeness, here is the rest of the class:

package de.hsec.datascience.btctrader;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Date;
import java.util.logging.Logger;

import org.json.JSONObject;

/**
 * Reads a BTC quote from the Bitstamp REST api.
 * 
 * @author helmut hauschild
 *
 */
public class BitstampQuoteReader implements QuoteReader {
  /**
   * Constant API URL
   */
  private static final URL BITSTAMP_API_URL;

  // static initializer for the API URL because we must handle the checked
  // MalformedURLException
  static {
    try {
      BITSTAMP_API_URL = new URL("https://www.bitstamp.net/api/ticker/");
    } catch (MalformedURLException e) {
      throw new RuntimeException(
          "Cannot initialize BitstampQuoteReader class.", e);
    }
  }

  /**
   * Logger
   */
  static Logger log = Logger.getLogger(BitstampQuoteReader.class.getName());

  // fields
  private double currentQuote;
  private double high24;
  private double low24;
  private double volume;
  private double bid;
  private double ask;
  private double vwap;

  /**
   * Accessor for the current quote. Updates all fields, so it should be
   * called before reading any of the other fields. The textbook way would be
   * to create a data object holding all relevant fields and return that
   * object. We don't do that because we want to avoid object creation for
   * performance reasons.
   */
  public double getCurrentQuote() {
    readNextAndUpdate();
    return currentQuote;
  }

  // Other accessors
  public double getBid() {
    return bid;
  }
  ...

  /**
   * Read current data from the Bitstamp API and update fields.
   */
  private void readNextAndUpdate() {
  ...
  }

  /**
   * Main method for a simple test run.
   */
  public final static void main(String[] args) {
    BitstampQuoteReader rqr = new BitstampQuoteReader();
    log.info("Next Quote: " + rqr.getCurrentQuote());
  }
}
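The body of readNextAndUpdate is elided above. To give an idea of the parsing step it performs, here is a self-contained sketch. The real method fetches BITSTAMP_API_URL and parses the response with org.json, as the imports show; this sketch instead extracts fields from a captured sample string with a plain regex, so it has no external dependencies. The field names (last, bid, ask, ...) and the sample values are assumptions based on Bitstamp's ticker format, not project code:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Illustrative sketch of the parsing inside readNextAndUpdate.
 * Bitstamp's ticker quotes its numbers as JSON strings, e.g. "last": "425.10",
 * which is why the regex expects quoted values.
 */
public class TickerParseSketch {

  /** Extract a numeric field like "last": "425.10" from a flat JSON string. */
  public static double field(String json, String name) {
    Matcher m = Pattern
        .compile("\"" + name + "\"\\s*:\\s*\"([0-9.]+)\"")
        .matcher(json);
    if (!m.find()) {
      throw new IllegalArgumentException("missing field: " + name);
    }
    return Double.parseDouble(m.group(1));
  }

  public static void main(String[] args) {
    // A captured sample response (values invented for illustration).
    String sample = "{\"high\": \"430.00\", \"last\": \"425.10\", "
        + "\"bid\": \"424.50\", \"ask\": \"425.10\", "
        + "\"low\": \"418.00\", \"volume\": \"11200.5\", \"vwap\": \"423.77\"}";
    System.out.println("last=" + field(sample, "last")
        + " bid=" + field(sample, "bid"));
  }
}
```

In the real class, each extracted value would simply be assigned to the corresponding field (currentQuote, bid, ask, and so on).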

Getting started

Unfortunately, every data science project starts with the somewhat tedious task of data acquisition and organization.

To accomplish anything at all, the first thing you’ll need is training data. So before anything else, you want to start collecting a lot of it.

Now, in a professional setting, you would take some time to think about the volume and structure of your input and output data. So I don't recommend doing at work what we are about to do next.

We postpone the careful thinking for now because we want to get things moving, and we are pretty sure that, whatever the result of our (later) deep thinking might be, it will include exchange rate ticker quotes for Bitcoin. Deliberating just a little further, we convince ourselves that other ticker quotes (currency exchange rates, stock prices, economic indicators) might also be useful for predicting the Bitcoin price, and that some data points related to the quotes may give us a little statistical advantage, too.

While musing about that, it occurs to us that for testing purposes we might also need a mechanism to easily generate random quotes when no data source is available. And for assessing the quality of the training results, we might later need a mechanism to replay historical quotes, so we can run old data against an updated neural network and find out how well it would have performed during a certain time interval.
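The random-quote mechanism could be as simple as a random walk around a starting price. Here is a minimal sketch; in the project such a class would implement QuoteReader, and the class name, the fixed seed, and the 0.1% step size are my own illustrative choices:

```java
import java.util.Random;

/**
 * Sketch of a test-data source: each call moves the quote by a random
 * step of up to +/-0.1%. Good enough to exercise the plumbing when no
 * real data source is available. Illustrative, not project code.
 */
public class RandomQuoteSketch {
  private final Random rnd;
  private double quote;

  public RandomQuoteSketch(double startQuote, long seed) {
    this.quote = startQuote;
    this.rnd = new Random(seed); // fixed seed keeps test runs reproducible
  }

  /** Next quote: previous quote times a factor in [0.999, 1.001). */
  public double getCurrentQuote() {
    quote *= 1.0 + (rnd.nextDouble() - 0.5) * 0.002;
    return quote;
  }

  // Crude derived values, sufficient for plumbing tests.
  public double getBid() { return quote * 0.999; }
  public double getAsk() { return quote * 1.001; }

  public static void main(String[] args) {
    RandomQuoteSketch r = new RandomQuoteSketch(420.0, 42L);
    for (int i = 0; i < 5; i++) {
      System.out.println(r.getCurrentQuote());
    }
  }
}
```

The replay mechanism for historical quotes would look similar: another QuoteReader implementation that reads recorded values from a file instead of generating them.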

All these considerations lead us to our first little class diagram:

[Class diagram: clsdiag_btcquotereader]

We see an interface for some sort of adapter (QuoteReader), with none of our favorite design patterns incorporated and not even a data representation class in sight. I realize that this is scary. Get used to it, because it is not an accident. We will do a lot of number crunching and, believe it or not, in this context it is GOOD practice to sacrifice beauty and object orientation on the altar of performance. The base rhythm of our architecture will be to avoid object creation in critical areas whenever possible. We will use arrays instead of collections, unless an external library forces collections on us. We will use primitive data types whenever possible. It will feel a lot like 1985, with one positive side effect: we will be quite happy about these decisions when we try to communicate with the GPU later.
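To make the "arrays instead of collections" point concrete, here is the kind of structure we will favor: a rolling window of recent quotes backed by a primitive double[] ring buffer, instead of an ArrayList&lt;Double&gt; that boxes every value and allocates on every tick. The class is illustrative, not project code:

```java
/**
 * Rolling window of the last N quotes, backed by a primitive double[].
 * Adding a quote allocates nothing and boxes nothing; once the buffer
 * is full, new quotes overwrite the oldest ones.
 */
public class QuoteWindow {
  private final double[] buf;
  private int next = 0; // index of the slot to overwrite next
  private int size = 0; // number of valid entries, up to buf.length

  public QuoteWindow(int capacity) {
    buf = new double[capacity];
  }

  public void add(double quote) {
    buf[next] = quote;
    next = (next + 1) % buf.length;
    if (size < buf.length) {
      size++;
    }
  }

  /** Mean over the valid entries; 0 if the window is still empty. */
  public double mean() {
    if (size == 0) {
      return 0.0;
    }
    double sum = 0.0;
    for (int i = 0; i < size; i++) {
      sum += buf[i];
    }
    return sum / size;
  }

  public static void main(String[] args) {
    QuoteWindow w = new QuoteWindow(3);
    w.add(1.0);
    w.add(2.0);
    w.add(3.0);
    w.add(4.0); // overwrites the oldest entry, 1.0
    System.out.println(w.mean()); // 3.0
  }
}
```

The same structure, sized appropriately, will later hold input windows for the network without producing any garbage collection pressure in the hot path.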

With this said, and the aesthetically minded among you properly scared, we move on to take a closer look at the interface:

package de.hsec.datascience.btctrader;

/**
 * Interface for adapter classes to ticker information sources.
 * 
 * @author helmut
 */
public interface QuoteReader {
  /**
   * Returns the current ticker value. Either a price fixed by a market maker
   * or the last trading price.
   */
  public double getCurrentQuote();

  /**
   * Returns the highest bid price in the order book.
   */
  public double getBid();

  /**
   * Returns the lowest ask price in the order book.
   */
  public double getAsk();

  /**
   * Returns the lowest price of the last 24 hours.
   */
  public double getMin24();

  /**
   * Returns the highest price of the last 24 hours.
   */
  public double getMax24();

  /**
   * Returns the 24-hour trade volume at the observed exchange.
   */
  public double getVolume24();

  /**
   * Returns the volume-weighted average price.
   */
  public double getVwap();
}


Ok, so we can use an instance of such a QuoteReader to access what seems to be market data from some exchange. The accessor methods come without a timestamp or index parameter, so we (correctly) assume that they return current data. We'll have to discuss our working definition of the word "current" later.

In the next post, we'll take a closer look at the implementations, especially the BitstampQuoteReader.

Predicting Bitcoin Prices

In this initial blog series, I am going to report on an automated Bitcoin trading system that I built in 2014 and successfully operated during 2015.

The decision-making component of this trading system incorporates machine learning methods: mainly a neural network and, in a data preparation step, principal component analysis (PCA).

The code was written in Java and Matlab. It is not always pretty, so when reading through it, please keep in mind that this started as a hobby project.

Some of the code I cannot publish, which I will explain when I get to it. But I will point out how to fill in the gaps.

I would like to encourage people to rebuild the system, use it to try out their own ideas, and share them with the rest of us. I also want to point out that, while Bitcoin trading is a good starting point, it is certainly not the only area where these methods are applicable.

Why is Bitcoin a good starting point? Because of an excellent technological infrastructure and immediate financial rewards, to name a few reasons. Also, Bitcoin is cool, which for me has some value on its own.

In the 12 months of operation, the system initiated roughly 11,000 transactions on Bitstamp, a Bitcoin exchange which, among other things, allows trading Bitcoin against fiat currency (USD). The system yielded a gross revenue a little above 26%. After transaction fees, a pre-tax return of around 20% remained. The result after taxes is a wholly different story, which we will talk about in a later post.

Now, a buy-and-hold strategy would have given me the same revenue during this time interval, with even fewer transaction fees. But I could not have known that at the beginning of the year.

The approach of the trading system is obviously completely different. It tries to predict small movements in the near future (a few minutes) based on observed market activities, news, economic data and a few other factors. In essence, it exploits the price's volatility. The beauty of this is that it works almost as well when the overall direction is southwards.

During the first months of the year, while taking its first clumsy, inexperienced trading steps, the system recorded its input data and added it to an ever larger body of training data. The neural network was trained and retrained several times, each time with more input data, and the results got better each time. From January to April, the trading yielded net negative results while the overall market went sideways. After that, the results were positive, even during a severe market decline in November. The last training took place in May: due to memory constraints (and because the training time had passed 24 hours), training with more data would have required a different approach. Since the results were already satisfactory, I decided to stick with what I had. So that is where we are now: with quite some room for improvement.

In the next few posts, I will very briefly lay out the theoretical foundations of the project, before we take a closer look into the code.