In this demo we show how to forecast whether the closing price of the NASDAQ-100 will move up or down. We do this with the help of a Random Forest classifier. I tried to replicate the result from a research paper authored by Luckyson Khaidem, Snehanshu Saha and Sudeepa Roy Dey:
The solution has been implemented in Scala in a Jupyter notebook with the BeakerX kernel, using the following libraries
We add the necessary Java libraries with the help of Maven...
%classpath config resolver maven-public http://software.pschatzmann.ch/repository/maven-public/
%%classpath add mvn
ch.pschatzmann:investor:LATEST
ch.pschatzmann:jupyter-jdk-extensions:LATEST
com.github.haifengl smile-scala_2.11 1.5.2
... and we import all relevant packages
import org.ta4j.core.Indicator
import org.ta4j.core.num.Num
import org.ta4j.core.indicators._
import org.ta4j.core.indicators.volume._
import org.ta4j.core.indicators.helpers._
import ch.pschatzmann.stocks.Context
import ch.pschatzmann.stocks.ta4j.indicator._
import ch.pschatzmann.stocks.integration._
import ch.pschatzmann.display._
First we can use the StockTimeSeriesEMA for the exponential smoothing of the original input time series. We generate a plot with different smoothing periods:
import scala.collection.JavaConverters._
// Use exponentially smoothed time series
var timeSeries = Context.getStockData("NDX").toTimeSeries()
var timeSeries0 = StockTimeSeriesEMA.create(timeSeries, 0)
var timeSeries5 = StockTimeSeriesEMA.create(timeSeries, 5)
var timeSeries10 = StockTimeSeriesEMA.create(timeSeries, 10)
var timeSeries20 = StockTimeSeriesEMA.create(timeSeries, 20)
var close = new ClosePriceIndicator(timeSeries)
var close0 = new NamedIndicator(new ClosePriceIndicator(timeSeries0),"close-0")
var close5 = new NamedIndicator(new ClosePriceIndicator(timeSeries5),"close-5")
var close10 = new NamedIndicator(new ClosePriceIndicator(timeSeries10),"close-10")
var close20 = new NamedIndicator(new ClosePriceIndicator(timeSeries20),"close-20")
var table = Table.create(close, close0,close5,close10,close20)
new SimpleTimePlot {
data = table.seq()
columns = Seq("ClosePriceIndicator","close-0","close-5","close-10","close-20")
showLegend = true
}
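To illustrate what the smoothing does, here is a minimal sketch of simple exponential smoothing in plain Scala. It is not the implementation behind StockTimeSeriesEMA: the function name `exponentialSmoothing` and the direct use of a smoothing factor `alpha` are assumptions made for this example, and how the library maps a smoothing period to a factor is not shown here.

```scala
// Simple exponential smoothing (illustrative sketch, not the library code):
//   s(0) = y(0)
//   s(t) = alpha * y(t) + (1 - alpha) * s(t - 1)
// `exponentialSmoothing` and `alpha` are hypothetical names for this example.
def exponentialSmoothing(values: Seq[Double], alpha: Double): Seq[Double] = {
  require(alpha > 0.0 && alpha <= 1.0, "alpha must be in (0, 1]")
  if (values.isEmpty) Seq.empty
  else values.tail.scanLeft(values.head)((prev, y) => alpha * y + (1 - alpha) * prev)
}

val prices = Seq(100.0, 102.0, 101.0, 105.0)
val smoothed = exponentialSmoothing(prices, 0.5)
// smoothed == Seq(100.0, 101.0, 101.0, 103.0)
```

A smaller `alpha` gives a smoother curve that reacts more slowly to new prices; with `alpha = 1.0` the series is returned unchanged.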
We can easily generate the described input features and output labels with the help of ta4j Indicators. We then display the data as a table.
import scala.collection.JavaConverters._
var timeSeries = timeSeries10
// Relative Strength Index
var close = new ClosePriceIndicator(timeSeries)
var rsi = new RSIIndicator(close, 10)
// Stochastic Oscillator
var sk = new StochasticOscillatorKIndicator(timeSeries, 10 )
var sd = new StochasticOscillatorDIndicator(sk)
// Williams %R
var williamsR = new WilliamsRIndicator(timeSeries,10)
// Moving Average Convergence Divergence
var macd = new MACDIndicator(close)
// Price Rate of Change
var roc = new ROCIndicator(close, 10)
// On Balance Volume
var obv = new OnBalanceVolumeIndicator(timeSeries)
// Label
var offsetIndicator= new OffsetIndicator(close, +10)
var difference = new DifferenceIndicator(close, offsetIndicator)
var label = new SignIndicator(difference)
var in:List[org.ta4j.core.Indicator[org.ta4j.core.num.Num]] = List(rsi,sk,sd,williamsR,macd,roc,obv)
var out:List[org.ta4j.core.Indicator[org.ta4j.core.num.Num]] = List(label)
val table = Table.create(rsi,sk,sd,williamsR,macd,roc,obv, label)
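The label construction above can be sketched in plain Scala. This is an illustration only: it assumes the label is the sign of the change in the closing price over the next `horizon` periods, and the function name `directionLabels` is made up for this example. How OffsetIndicator and DifferenceIndicator treat the rows at the end of the series is not covered.

```scala
// Direction label (illustrative sketch): sign of close(t + horizon) - close(t).
// Assumption: rows without a future value (the last `horizon` rows) get no label.
// `directionLabels` is a hypothetical helper, not part of ta4j or the investor library.
def directionLabels(close: Seq[Double], horizon: Int): Seq[Int] = {
  require(horizon > 0, "horizon must be positive")
  close.zip(close.drop(horizon)).map { case (now, future) =>
    math.signum(future - now).toInt // -1 = down, 0 = unchanged, +1 = up
  }
}

val closePrices = Seq(10.0, 11.0, 9.0, 9.0, 12.0)
val labels = directionLabels(closePrices, 1)
// labels == Seq(1, -1, 0, 1)
```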
Before we use the data for machine learning we shuffle it and split it into a training and a test dataset:
table.shuffle()
val tuple = table.split(0.8)
val training = tuple.x
val testing = tuple.y
s"Size: ${training.size} / ${testing.size}"
Size: 1423 / 355
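For readers without the Table class at hand, a shuffle followed by an 80/20 split could look roughly like this in plain Scala. The helper `shuffleAndSplit`, the fixed seed and the rounding of the cut point are assumptions of this sketch; Table.shuffle and Table.split may behave differently in the details.

```scala
import scala.util.Random

// Shuffle the rows and split them into a training and a test part (illustrative sketch).
// `shuffleAndSplit` is a hypothetical helper; the seed only makes the example repeatable.
def shuffleAndSplit[A](rows: Seq[A], trainFraction: Double, seed: Long): (Seq[A], Seq[A]) = {
  require(trainFraction >= 0.0 && trainFraction <= 1.0, "trainFraction must be in [0, 1]")
  val shuffled = new Random(seed).shuffle(rows)
  val cut = math.round(rows.size * trainFraction).toInt // rounding rule is an assumption
  shuffled.splitAt(cut)
}

val (trainRows, testRows) = shuffleAndSplit(1 to 10, 0.8, seed = 42L)
// trainRows has 8 elements, testRows has 2
```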
Smile needs Arrays as input. We can convert our Table data into Arrays by calling to1DArray and to2DArray. The SignIndicator yields -1, 0 or +1, but Smile expects the class labels as integers starting from 0. We therefore convert the values to integers and add 1, so that we get 0 for negative, 1 for unchanged and 2 for positive.
We select the table columns by naming the fields to include or, with a - prefix, the fields to exclude. Because we get the numbers as java.lang.Double we need to convert them to Scala Doubles:
import smile.classification.RandomForest;
val trainLabel = training.to1DArray("SignIndicator").map(n => n.toInt + 1)
val trainData = training.to2DArray("-SignIndicator","-time").map(_.map(_.toDouble))
// Note: this sets the number of trees to the number of distinct labels
val numberOfTrees = trainLabel.toSet.size
val model = new RandomForest(trainData, trainLabel, numberOfTrees)
smile.classification.RandomForest@112b049f
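The two conversions described above can be tried in isolation on hand-made values. This is only a sketch: the arrays `rawLabels` and `rawRows` are invented stand-ins for what to1DArray and to2DArray return, assuming they deliver boxed java.lang.Double values.

```scala
// Invented sample data standing in for the table output (assumption: boxed doubles).
val rawLabels: Array[java.lang.Double] =
  Array(-1.0, 0.0, 1.0, 1.0).map(d => java.lang.Double.valueOf(d))
val rawRows: Array[Array[java.lang.Double]] =
  Array(Array(0.5, 1.5), Array(2.5, 3.5)).map(_.map(d => java.lang.Double.valueOf(d)))

// Shift the sign values -1, 0, +1 to the class ids 0, 1, 2.
val classIds: Array[Int] = rawLabels.map(n => n.doubleValue.toInt + 1)

// Unbox every feature value to a Scala Double.
val features: Array[Array[Double]] = rawRows.map(_.map(_.doubleValue))
// classIds contains 0, 1, 2, 2
```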
Finally we calculate the accuracy of our predictions on the test data:
import smile.validation._
val testLabel = testing.to1DArray("SignIndicator").map(n => n.toInt + 1)
val testData = testing.to2DArray("-SignIndicator","-time").map(_.map(_.toDouble))
// Predict a class for every test row and compare it with the true labels
val labelPredict = testData.map(row => model.predict(row))
val accuracyResult = accuracy(testLabel, labelPredict)
0.7492957746478873
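The accuracy reported above is the share of test rows for which the predicted class equals the true class. A minimal plain Scala sketch of that calculation, with the hypothetical helper name `accuracyOf` and invented sample labels, might look like this:

```scala
// Accuracy = number of correct predictions / number of predictions (illustrative sketch).
// `accuracyOf` is a hypothetical helper, not the Smile function used above.
def accuracyOf(truth: Array[Int], prediction: Array[Int]): Double = {
  require(truth.nonEmpty && truth.length == prediction.length,
    "both arrays must be non-empty and of equal length")
  truth.zip(prediction).count { case (t, p) => t == p }.toDouble / truth.length
}

val sampleTruth = Array(0, 2, 2, 1)
val samplePrediction = Array(0, 2, 1, 1)
val sampleAccuracy = accuracyOf(sampleTruth, samplePrediction)
// 3 of 4 predictions match, so sampleAccuracy == 0.75
```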