One of the most fundamental challenges for deep learning researchers is to create artificial intelligence and machine learning models that make good decisions in highly uncertain environments. This applies to many industries, and especially to trading in the cryptocurrency market. Taking a first-principles approach, reinforcement learning captures what we believe are the ground truths of cryptocurrency trading behavior, which are:
Delayed Consequences / Reward
Exploration vs. Exploitation
These key components play a major role in finding the optimal policy for an artificial intelligence agent operating in a particular cryptocurrency market. We believe that understanding delayed consequences is one of the most fundamental aspects of cryptocurrency trading and of any reinforcement learning algorithm: the agent has to balance exploration and exploitation (short-term vs. long-term reward) through millions of simulations of trial-and-error learning. In cryptocurrency trading, our artificial intelligence agent must master the following:
Non-stationary data. Take Bitcoin as an example: its estimated volatility is roughly 3.96% over 30 days, 3.3% over 60 days, 3.28% over 120 days and 4.02% over 252 days. By these standards, we can safely assume that the Bitcoin price is highly volatile, and therefore that historical data will show a non-constant mean and variance.
High-dimensional, continuous action space. Each artificial intelligence agent has 3 major parent actions that ultimately determine the total reward or punishment at the end of its trading session. Each action taken in a specific state of the environment affects the end result together with the actions taken previously, and the outcome is measured at the end by the total profit or loss made.
High-dimensional observation space. A great deal of feature engineering must be taken into account in order to make full use of technical and statistical analysis in finding the optimal behavior function for a particular cryptocurrency market. There are roughly 4 major types of indicators in technical analysis, each with at least 20 to 30 popular metrics. This creates a high-dimensional observation space from which our artificial intelligence learns whether each indicator is correlated with the market. In addition, in order to roughly predict the future closing price, the artificial intelligence agent needs a stationary representation of the market data.
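The volatility estimates and the stationary representation mentioned above can be sketched as follows. This is a minimal illustration, assuming that volatility is taken as the sample standard deviation of daily log returns over a trailing window; the function names and the price list are hypothetical and are not our production code.

```python
import math
import statistics


def log_returns(prices):
    """Daily log returns r_t = ln(p_t / p_{t-1}).

    Log returns are a common way to turn a trending price series
    into something closer to a stationary series.
    """
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def window_volatility(prices, window):
    """Sample standard deviation of the last `window` log returns."""
    returns = log_returns(prices)
    if window < 2 or len(returns) < window:
        raise ValueError("need a window of at least 2 and enough prices")
    return statistics.stdev(returns[-window:])


# Hypothetical closing prices, for illustration only.
closes = [100.0, 103.0, 99.0, 104.0, 101.0, 108.0, 105.0]
print(f"3-day volatility: {window_volatility(closes, 3):.4f}")
print(f"6-day volatility: {window_volatility(closes, 6):.4f}")
```

With real daily closes, calling window_volatility with windows of 30, 60, 120 and 252 would give estimates comparable in form to the figures quoted above.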
It is also important to mention that each trade incurs transaction fees that depend on the exchange where the agent is deployed. The logic for each action taken by the agent takes milliseconds per tick to execute.
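To see why fees matter for a scalping agent, here is a small sketch of a fee-adjusted return for one long round trip. The 0.1% fee rate is a hypothetical value chosen for illustration; actual fees depend on the exchange and on whether the order is a taker or maker order.

```python
def net_return(entry_price, exit_price, fee_rate):
    """Fractional return of a long round trip, paying `fee_rate`
    on both the buy and the sell."""
    # Units bought with 1 unit of quote currency, after the buy fee.
    units = (1.0 - fee_rate) / entry_price
    # Quote currency received when selling, after the sell fee.
    proceeds = units * exit_price * (1.0 - fee_rate)
    return proceeds - 1.0


FEE = 0.001  # hypothetical 0.1% fee per side

gross_move = 100.15 / 100.0 - 1.0
print(f"gross move: {gross_move:+.4%}")
print(f"net of fees: {net_return(100.0, 100.15, FEE):+.4%}")
```

In this example a small positive price move becomes a loss once both fees are paid, which is why the reward an agent learns from should be computed net of fees.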
We believe that through reinforcement learning, we can acquire an optimal policy function that is applicable to real-world use cases in continuous environments, such as cryptocurrency trading.
If you are interested in learning more about our approach to tackling this problem, we’re hiring!
1 The AI scalping algorithm measures the probability of success over historical scalping trades and updates in real time with current or ongoing scalping opportunities.
2 DeepMind Technologies is a UK-based computer programs company founded in September 2010 and acquired by Google in 2014.
3 OpenAI is an artificial intelligence company founded on December 11, 2015 and based in San Francisco.
4 Whether a decision is “good” can be measured by the reward or consequences that follow an agent’s action on its environment.
5 Taken on November 24, 2019. Measurements and calculations are based on the formula from Investopedia.
6 Major actions: Buy, Hold and Sell, with at least 4 child actions each: 25%-100% of total capital, and market taker/market maker.
7 The word "predict" here refers to the calculated future price of the market, based on the high-dimensional observation space, with a confidence level for each predicted price.
I&E AI utilizes computer vision technology to help understand and classify major directional changes in the cryptocurrency market. This is one of the key aspects of our decision-making process in calculating and managing risk. Using a large training dataset, I&E AI generalizes market patterns to classify directional changes by processing an image of a cryptocurrency chart. The key here is not to overcomplicate our overall decision-making process; we believe that there are 5 key directional changes or continuations that support self-sustaining and reliable ground-truth classified data that can be used by the I&E AI trading bot.
One of our biggest challenges in this project is providing highly relevant training datasets to our model, as this is our bottleneck in achieving a state-of-the-art market trend classifier. Our market trend classifier must master the following areas:
Vanishing gradient problem.
Many supervised learning models experience the vanishing gradient problem, which ultimately limits the overall accuracy of the model.
Labeled large datasets.
Classifying a high-dimensional image with high precision requires large training datasets. Our datasets consist of properly labeled cryptocurrency chart images with implicit variance in colors and dimensions. This variance is crucial to our overall model performance, as it helps avoid overfitting and, in the end, generalize better.
Managing large datasets can be difficult when it comes to optimizing for precision. To clean our data, our training and validation datasets must have a high relevance score overall. Images with low relevance may increase the number of false positives classified by our model, which may call for further data cleaning in order to increase the overall precision.
We have discretized major trend directional changes into 5 different classes: ascending triangle, descending triangle, consolidation, inverse head and shoulders, and head and shoulders. These 5 major patterns embody the characteristics of market directional changes or trend continuation. For instance, a definitive ascending or descending triangle pattern in higher timeframes will generally have a stronger probability of trend continuation. In contrast, a market that shows a definitive head and shoulders or inverse head and shoulders pattern will generally have a stronger probability of a change in market direction.
To accumulate large datasets, we must be agile in retrieving highly relevant images from many platforms. What better platform for highly relevant images than Google Images? For a start, we collected data from Google Images, retrieving at least 1,000 images for each class that we are trying to classify. For this specific project we need more than that: at least 10,000 images per class, with the variance in colors and dimensions needed to generalize with high overall precision. For most of our data, we ignored images with angle transformations, since most of the time the input to our model is a flat-angle screenshot of a chart. We also have to exclude mirroring and rotation transformations of our images, as the direction of the candlesticks is fundamental to classifying features in the image.
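The reason mirroring is excluded can be illustrated with a toy example. The sketch below flips a simplified price shape top to bottom, as a vertical image flip would, and shows that a head and shoulders shape turns into its inverse, so the original label would no longer be valid. The series and the helper names are hypothetical, and the shape check is deliberately simplistic.

```python
def vertical_flip(series):
    """Mirror a price series top to bottom, like a vertical image flip."""
    hi, lo = max(series), min(series)
    return [hi + lo - x for x in series]


def has_central_peak(series):
    """Toy check for a head and shoulders shape: the highest point lies
    strictly inside the series (a central 'head'), not at either edge."""
    peak = series.index(max(series))
    return 0 < peak < len(series) - 1


# Hypothetical shape: left shoulder, head, right shoulder.
pattern = [1.0, 3.0, 2.0, 5.0, 2.0, 3.0, 1.0]
flipped = vertical_flip(pattern)

print("original has central peak:", has_central_peak(pattern))
print("flipped has central peak: ", has_central_peak(flipped))
```

After the flip, the central peak becomes a central trough, which is the inverse pattern. An augmentation pipeline for this kind of data would therefore keep flips and rotations disabled.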
To overcome the vanishing gradient problem, we decided to use the ResNet34 architecture provided by FastAI for our model. The sole purpose of using this architecture is to let our model use more parameters in a very deep network without degrading overall performance.
With roughly 40,000 images in our set, we split the data 20/80% between the validation and training datasets. Our naïve approach was to train the model with the standard learning rate and use the error rate as our overall metric.
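The 20/80% split can be sketched in a few lines. This is a minimal example, assuming a simple random split with a fixed seed; the function name and the integer image IDs are placeholders, and the split is not stratified by class.

```python
import random


def split_dataset(items, valid_fraction=0.2, seed=42):
    """Shuffle `items` reproducibly and return (train, valid)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]


# Placeholder IDs standing in for roughly 40,000 chart images.
image_ids = range(40_000)
train, valid = split_dataset(image_ids)
print(len(train), "training images,", len(valid), "validation images")
```

Fixing the seed keeps the validation set identical between runs, so that error rates from different training runs remain comparable.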
Table 1: naïve training approach with standard learning rate using error rate as our overall metric.
As we can clearly see, the model is not performing well at all. It has an error rate of roughly 44.59% by the end of its 5th training cycle. We need to re-tune the learning rate in order to minimize the error rate.
With the help of the FastAI library, we can use its lr_find() method to speed up the search for the optimal hyper-parameter needed to train our model. By convention, we take the learning rate (x-axis) just before the loss value (y-axis) flies off the roof, which is at 1e-02, and then use a learning rate 10x smaller (1e-03).
Loss vs. Learning Rate
Chart 1: lr_find increases the learning rate at each iteration and records the resulting loss.
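The convention of reading the learning rate off this chart can be written down as a small rule. The sketch below is not FastAI code: it takes a list of (learning rate, loss) points such as a learning rate sweep would produce, finds the last learning rate before the loss blows up, and divides it by 10. The blow-up threshold of 4x the best loss and the sweep values are assumptions made for illustration.

```python
def pick_learning_rate(lrs, losses, blowup_factor=4.0, safety_divisor=10.0):
    """Return (last_stable_lr, chosen_lr).

    `last_stable_lr` is the last learning rate seen before the loss
    exceeds `blowup_factor` times the best loss so far; `chosen_lr`
    is that value divided by `safety_divisor`.
    """
    best_loss = float("inf")
    last_stable = lrs[0]
    for lr, loss in zip(lrs, losses):
        if loss > blowup_factor * best_loss:
            break  # the loss has diverged; stop scanning
        best_loss = min(best_loss, loss)
        last_stable = lr
    return last_stable, last_stable / safety_divisor


# Hypothetical sweep results, for illustration only.
sweep_lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
sweep_losses = [1.60, 1.55, 1.30, 1.10, 6.00, 40.0]

edge, chosen = pick_learning_rate(sweep_lrs, sweep_losses)
print(f"loss diverges after lr={edge:g}; train with lr={chosen:g}")
```

In practice the choice is usually made by eye from the plot; the rule above only makes the "10x smaller" convention explicit.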
After this hyper-parameter tuning, we can train our model again and see how well it performs.
Table 2: training the model after learning rate fine-tuning, using error rate as our overall metric.
The error rate shrinks to 27.41% after hyper-parameter tuning, which corresponds to an overall accuracy of 72.59%.
Transfer Learning and Data Processing
To achieve higher accuracy, we applied a transfer learning approach to our classification problem. We first trained our model with an image size of 224x224; through transfer learning, we then train the model on smaller 128x128 images. The general idea is to have a pre-trained model (the first trained model) extract general low-level features such as edges, patterns and gradients, and then to train this model with a smaller image size to identify more detailed features within the graphs, such as candlestick trend patterns and trend direction. However, before starting this process, we need to clean our data based on the results of the previous iterations.
Using FastAI’s ClassificationInterpretation class, we can easily visualize our model’s overall performance and, more importantly, identify its false positives. To increase our overall precision, we must understand what causes them. What we found is that our datasets contain many mislabeled images, images with high distortion, images with low relevance, and some that simply were not supposed to be there. In response, we ran a data cleaning pass to reduce the model’s overall false positive rate. This step is crucial to our optimization approach, as it increases accuracy before transfer learning is applied to the model.
Heat Map Chart 1: the confusion matrix visualizes the model's prediction distribution over x iterations.
If we look at the confusion matrix above, we can see the distribution of our model's false positives for each class. Misclassifications are most frequent between head and shoulders and its inverse, and there is also a slightly elevated number of false positives between consolidation and descending triangle images. We focus on these frequent misclassifications and pinpoint the characteristics of the data that might affect the model's prediction. As previously hypothesized, some images have low relevance to their class, including images with high noise and distortion. These images affect the overall performance of the model and should be removed from the training set.
Images with high noise and distortion affect the model's overall performance in generalizing patterns and edges for specific classes.
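The kind of analysis described above, finding the most frequently confused class pairs and the precision per class, can be sketched without any library. This is an illustrative example with made-up labels and predictions; it is not output from our model and does not use the FastAI API.

```python
from collections import Counter

HS = "head and shoulders"
IHS = "inverse head and shoulders"
CONS = "consolidation"
DESC = "descending triangle"


def most_confused(actual, predicted, min_count=1):
    """List (actual, predicted, count) for misclassified pairs,
    most frequent first."""
    pairs = Counter((a, p) for a, p in zip(actual, predicted) if a != p)
    return [(a, p, n) for (a, p), n in pairs.most_common() if n >= min_count]


def precision_for(actual, predicted, label):
    """Precision = true positives / all predictions of `label`."""
    hits = [a for a, p in zip(actual, predicted) if p == label]
    if not hits:
        return 0.0
    return sum(a == label for a in hits) / len(hits)


# Made-up ground truth and predictions, for illustration only.
actual = [HS, HS, HS, IHS, IHS, CONS, CONS, DESC]
predicted = [HS, IHS, IHS, IHS, HS, DESC, CONS, DESC]

for a, p, n in most_confused(actual, predicted):
    print(f"{a!r} predicted as {p!r}: {n}")
print(f"precision for {HS!r}: {precision_for(actual, predicted, HS):.2f}")
```

Images behind the most frequent pairs are the first candidates to inspect for noise, distortion or wrong labels.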
Our approach was to use the same dataset of roughly 40,000 images, excluding the bad training data removed earlier, and to resize it to a smaller image size (128x128) with the same 20/80% validation and training distribution. We then repeated the overall training method, but this time we found the optimal hyper-parameter before training our model on the smaller images. This time our metric is the accuracy of the model itself instead of the error rate.
Table 3: transfer learning approach with fine-tuning learning rate using accuracy as our overall metrics.
As hypothesized, the overall error rate of the model decreased significantly, because detailed features of the graphs are extracted much better after applying transfer learning. Previously, when the model was trained with a bigger image size, it could only extract general macro features such as edges, gradients and shapes.
As shown in Table 3, the model reached a maximum of roughly 95.1% overall accuracy. However, we can see an increase in training and validation error over 10 iterations, suggesting that the current model may need a different optimization approach if we want to increase the overall accuracy further. At the same time, we do not want to overfit our model, and the current 95.1% is sufficient for generalizing trend directions.
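One common way to guard against overfitting when validation error starts to rise is early stopping: keep the epoch with the lowest validation loss and stop once it has failed to improve for a few epochs. This is a generic technique shown for illustration, not a description of how our model was trained; the loss values and the patience setting are hypothetical.

```python
def best_epoch(valid_losses, patience=2):
    """Return the index of the epoch with the lowest validation loss,
    scanning until the loss fails to improve `patience` times in a row."""
    best_idx, best_loss, waited = 0, float("inf"), 0
    for i, loss in enumerate(valid_losses):
        if loss < best_loss:
            best_idx, best_loss, waited = i, loss, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation loss keeps rising; stop here
    return best_idx


# Hypothetical validation losses per epoch, for illustration only.
history = [0.40, 0.25, 0.18, 0.16, 0.17, 0.19, 0.22]
print("keep the weights from epoch index", best_epoch(history))
```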
We have made our AI vision technology the first-tier filter in our overall scalping algorithm, confirming the market's direction before applying our complex technical analysis to find potential profit-scalping setups.
Analyze scalping types 1 to 6 based on the judgment of the neural network, and look for the trade points that capture exchange-rate profit.
AI-based automatic trading has started using the APIs of BINANCE and INDDAX. This has made our business simpler.
・Basic strategy = "Strategy using deep reinforcement learning". This strategy is very different from conventional ones because there is no specific indicator or signal that starts a trade; instead, through countless rounds of trial and error over the given observations, the AI agent outputs the action with the highest probability. Basically, our strategy is called "deep reinforcement learning" (which is currently one of the world's leading AI algorithms).
・Trading exchange ="Binance" or any other exchange
・Analyzing and trading all technical indicators with AI
・Sharpe ratio (risk-adjusted return) = 2 or more only
・Spot trading only (buying and selling the underlying asset)
・Trade only the top USDT pairs (preferring those with the highest 24-hour liquidity)
・Trade only opportunities filtered by a unique 512-layer neural network
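The Sharpe ratio criterion in the list above can be made concrete with a short sketch. It assumes the usual definition, mean excess return divided by the standard deviation of returns, annualized by the square root of the number of periods per year. The daily returns, the zero risk-free rate and the 365-day year (crypto markets trade every day) are illustrative assumptions.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=365):
    """Annualized Sharpe ratio of per-period returns.

    `risk_free_rate` is expressed per period, like `returns`.
    """
    excess = [r - risk_free_rate for r in returns]
    spread = statistics.stdev(excess)
    if spread == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / spread * math.sqrt(periods_per_year)


def passes_filter(returns, threshold=2.0):
    """True when the strategy meets the 'Sharpe ratio of 2 or more' rule."""
    return sharpe_ratio(returns) >= threshold


# Hypothetical daily returns, for illustration only.
daily = [0.010, -0.005, 0.007, 0.002, -0.003, 0.006]
print(f"Sharpe ratio: {sharpe_ratio(daily):.2f}")
print("passes the filter:", passes_filter(daily))
```

A handful of returns like this gives a very noisy estimate; a meaningful Sharpe ratio needs a much longer return history.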
Deep reinforcement learning strategy
・Managed Risk Dynamic Risk Order Strategy:
It is a discrete action model strategy that manages risk and determines actions based on a dynamic ordering of buys and sells. Managed risk orders follow the stop-loss and take-profit percentage boundaries for all trades. However, this action model also allows agents to cancel orders and to place buy or sell orders with a specific trade size. In essence, the AI agent predicts the probability of each action and takes the most probable one.
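A managed risk order of this kind can be pictured as a long position with a stop-loss boundary below the entry price and a take-profit boundary above it. The sketch below is a simplified illustration under that assumption; the function name, prices and percentages are hypothetical, and it ignores fees, slippage and order cancellation.

```python
def resolve_bracket(entry_price, later_prices, stop_loss_pct, take_profit_pct):
    """Walk the prices after a long entry and report how the trade ends.

    Returns (outcome, exit_price), where outcome is 'stop_loss',
    'take_profit', or 'open' if neither boundary was reached.
    """
    stop_price = entry_price * (1.0 - stop_loss_pct / 100.0)
    target_price = entry_price * (1.0 + take_profit_pct / 100.0)
    for price in later_prices:
        if price <= stop_price:
            return "stop_loss", price
        if price >= target_price:
            return "take_profit", price
    last_price = later_prices[-1] if later_prices else entry_price
    return "open", last_price


# Hypothetical tick prices after entering at 100.0.
print(resolve_bracket(100.0, [100.2, 99.8, 100.9, 101.3], 0.5, 1.0))
print(resolve_bracket(100.0, [100.2, 99.4, 101.5], 0.5, 1.0))
```

The first trade ends at the take-profit boundary and the second at the stop-loss boundary, since the stop is reached before the later rally.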
Currently, the AI has 8,424 possible actions. These include stop losses in the range 0.2% to 1.5%, trade sizes in the range 10% to 100% of total capital, and take-profit levels in the range 0.2% to 3.15%. The strategy our AI currently uses is derived by observing current market statistics from technical indicators.
・This strategy embodies the Actor-Critic architecture (PPO): the actor outputs a probability for each action, and the highest-probability action is taken (one trade). The critic model then predicts how well that particular action will perform, based on the current observations (which consist of the technical analysis described above). This strategy is run over millions of simulations until the AI agent converges on the best probability estimates.
・This strategy is very different from traditional ones because it has no specific indicator or signal that initiates a trade; instead, the AI agent outputs the highest-probability action, learned through countless rounds of trial and error over the given observations. Basically, our strategy is called "deep reinforcement learning" (which is currently one of the world's leading AI algorithms).
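The idea of a discrete action set built from stop-loss, trade-size and take-profit choices, with the actor picking the most probable action, can be sketched as follows. The grid values are hypothetical and far coarser than the real one, so the sketch does not reproduce the 8,424 actions; the logits stand in for the output of a trained actor network, and PPO training and the critic are not shown.

```python
import math
from itertools import product

# Hypothetical grid values within the ranges quoted in the text.
STOP_LOSSES = [0.2, 0.5, 1.0, 1.5]     # percent
TRADE_SIZES = [10, 50, 100]            # percent of total capital
TAKE_PROFITS = [0.2, 1.0, 2.0, 3.15]   # percent

# Every (stop loss, trade size, take profit) combination is one action.
ACTIONS = list(product(STOP_LOSSES, TRADE_SIZES, TAKE_PROFITS))


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(logits):
    """Return the highest-probability action and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ACTIONS[best], probs[best]


# Placeholder logits; a trained actor would produce these from observations.
logits = [0.0] * len(ACTIONS)
logits[17] = 3.0

action, prob = select_action(logits)
print(len(ACTIONS), "actions in this toy grid")
print("selected (stop loss %, size %, take profit %):", action)
```

During training, a PPO agent would sample from these probabilities rather than always taking the maximum, which is how exploration enters the picture.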