Case Study

3.1 Input Data and Processing

The model is trained on Binance spot price data for BTC/USDT spanning 4 years, from November 2020 to November 2024. The data comprises OHLCV (Open, High, Low, Close, Volume) features resampled at 3-second intervals. Using a 10-minute forecast window, the model predicts whether the average price over the next 10 minutes will exceed, fall below, or remain approximately equal to the average price of the preceding 10-minute window.
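To make the resampling step concrete, here is a minimal sketch of aggregating raw ticks into 3-second OHLCV bars. It assumes the raw input is a list of (timestamp in seconds, price, quantity) trades, which the text does not specify; the names resample_ohlcv, BAR_SECONDS and WINDOW_BARS are hypothetical. Intervals with no trades are simply left out rather than forward-filled.

```python
BAR_SECONDS = 3                          # bar length stated in the text
WINDOW_BARS = 10 * 60 // BAR_SECONDS     # bars in one 10-minute window

def resample_ohlcv(trades, bar_seconds=BAR_SECONDS):
    """Aggregate (timestamp_sec, price, qty) trades into OHLCV bars.

    Assumption: the raw data are individual trades. Returns a dict that
    maps each bar's start time to its open/high/low/close/volume values.
    """
    bars = {}
    # Sort by timestamp only, so trades with equal timestamps keep their order.
    for ts, price, qty in sorted(trades, key=lambda t: t[0]):
        bucket = int(ts // bar_seconds) * bar_seconds
        bar = bars.get(bucket)
        if bar is None:
            bars[bucket] = {"open": price, "high": price, "low": price,
                            "close": price, "volume": qty}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price          # last trade seen in the bar
            bar["volume"] += qty
    return bars

# Hypothetical ticks, deliberately out of order
trades = [(0.5, 100.0, 1.0), (2.9, 102.0, 2.0), (1.0, 99.0, 0.5), (3.1, 101.0, 1.0)]
bars = resample_ohlcv(trades)
```

At 3 seconds per bar, each 10-minute window covers 200 bars, which is the window length the later sections work with.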

3.2 Processing Methodology

Rolling training approach: the model is trained on 6-month segments and tested on the subsequent 3-month period.
The training and test windows advance in 3-month increments across the 4 years.
The feature set combines OHLCV price aggregations with technical analysis indicators.
Features are selected based on their statistical correlation with future returns.
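The rolling scheme can be sketched as a generator of train/test date boundaries. This is an illustration only: it assumes windows aligned to the first day of each month and half-open intervals, neither of which is stated in the text, and the helper names add_months and walk_forward_splits are hypothetical.

```python
from datetime import date

def add_months(d, months):
    """Shift a first-of-month date by a whole number of months."""
    index = d.year * 12 + (d.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def walk_forward_splits(start, end, train_months=6, test_months=3, step_months=3):
    """Return (train_start, train_end, test_end) triples.

    Training covers [train_start, train_end) and testing covers
    [train_end, test_end). A split is kept only if its test period
    ends on or before `end`.
    """
    splits = []
    train_start = start
    while True:
        train_end = add_months(train_start, train_months)
        test_end = add_months(train_end, test_months)
        if test_end > end:
            break
        splits.append((train_start, train_end, test_end))
        train_start = add_months(train_start, step_months)
    return splits

# Assumed boundaries: 1 November 2020 to 1 November 2024
splits = walk_forward_splits(date(2020, 11, 1), date(2024, 11, 1))
for train_start, train_end, test_end in splits[:2]:
    print(train_start, train_end, test_end)
```

Under these assumptions the call yields 14 train/test pairs, with consecutive test periods tiling the data without overlap because the step equals the test length.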

3.3 Classification Framework

Price movement is classified by the percentage change between the weighted averages of the past and future windows. Linear weighting prioritizes recent values within each window to make the prediction target more realistic. A label is assigned according to whether the change lies within, above, or below the percentage threshold. A threshold of 0.08% was chosen because it gave the best balance between the three labels over the 4 years.
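The labelling rule described above can be sketched in a few lines. The sketch assumes weights that grow linearly from 1 to n with the time index inside each window, and a symmetric threshold of ±0.08%; the exact weights are not given in the text, and the function names are hypothetical. Label numbering follows the document (0 = down, 1 = stationary, 2 = up).

```python
def linear_weighted_mean(values):
    """Weighted mean with weights 1, 2, ..., n (later values weigh more).

    Assumption: the text only says the weighting is linear and favours
    recent values; this particular weight sequence is illustrative.
    """
    weights = range(1, len(values) + 1)
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

def label_movement(past_prices, future_prices, threshold_pct=0.08):
    """Return 0 (down), 1 (stationary) or 2 (up) for one pair of windows."""
    past_avg = linear_weighted_mean(past_prices)
    future_avg = linear_weighted_mean(future_prices)
    change_pct = 100.0 * (future_avg - past_avg) / past_avg
    if change_pct > threshold_pct:
        return 2
    if change_pct < -threshold_pct:
        return 0
    return 1

# Hypothetical three-bar windows; real windows would hold 200 bars each
print(label_movement([100.0, 100.0, 100.0], [101.0, 101.0, 101.0]))
```

A change of exactly ±0.08% is treated as stationary here; the text does not say how the boundary case is handled.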

Classification Formula:
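One reading of the rule that is consistent with the description in this section is written out below. It is a reconstruction under assumptions, not the original formula: p denotes the bar price (for example the close), n is the number of bars per window (200 for a 10-minute window of 3-second bars), the weights 1, ..., n increase with time inside each window, and θ is the threshold in percent.

```latex
\bar{P}_{\text{past}}(t) = \frac{\sum_{i=1}^{n} i \, p_{t-n+i}}{\sum_{i=1}^{n} i},
\qquad
\bar{P}_{\text{future}}(t) = \frac{\sum_{i=1}^{n} i \, p_{t+i}}{\sum_{i=1}^{n} i}

\Delta(t) = 100 \cdot \frac{\bar{P}_{\text{future}}(t) - \bar{P}_{\text{past}}(t)}{\bar{P}_{\text{past}}(t)}

\text{label}(t) =
\begin{cases}
2 \ (\text{up}) & \text{if } \Delta(t) > \theta \\
0 \ (\text{down}) & \text{if } \Delta(t) < -\theta \\
1 \ (\text{stationary}) & \text{otherwise}
\end{cases}
\qquad \theta = 0.08
```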

3.4 Model Output and Confidence Measure

The model generates a probability distribution across three classes: upward, downward, or stationary movement. Rather than providing a simple categorical prediction, it outputs class-wise probability estimates. We derive a confidence measure using the entropy of these probabilities, where lower entropy indicates higher prediction confidence and higher entropy suggests uncertainty.

The final output comprises the most probable label and its associated confidence level, formatted as:

Prediction: label 2 (up), Confidence: 80%.

This approach provides not only the predicted price movement but also a measure of how reliable the prediction is, allowing for more informed trading decisions based on both forecast direction and confidence level.
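A minimal sketch of an entropy-based confidence score follows. The text does not say how entropy is mapped to a percentage, so the mapping used here (one minus the entropy divided by its maximum, log 3 for three classes) is an assumption, as are the function names.

```python
import math

LABEL_NAMES = {0: "down", 1: "stationary", 2: "up"}

def entropy_confidence(probs):
    """Map a probability vector to a confidence value in [0, 1].

    Assumption: confidence = 1 - H(p) / log(K), where H is the Shannon
    entropy and K the number of classes. A uniform distribution gives 0,
    a one-hot distribution gives 1.
    """
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(len(probs))

def format_prediction(probs):
    """Return the most probable label and its confidence as a string."""
    label = max(range(len(probs)), key=probs.__getitem__)
    confidence = entropy_confidence(probs)
    return f"Prediction: label {label} ({LABEL_NAMES[label]}), Confidence: {confidence:.0%}."

# Hypothetical model output for one 10-minute window
print(format_prediction([0.02, 0.03, 0.95]))
```

With this normalisation the score depends on the whole distribution rather than only on the top probability, so two predictions with the same winning class can receive different confidence values.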

3.5 Results

The results of our 10-minute prediction model show promising accuracy and robustness in predicting price movement. Specifically, predictions with a confidence greater than 50% exhibit consistent accuracy over the tested periods.

Over the 4-year sample period, Yo-Yo’s model achieved a mean test accuracy of 79.1% for high-confidence predictions (>50% confidence), with the accuracy remaining stable across diverse market conditions.

As shown in Figure 1, the accuracy of predictions with greater than 50% confidence held consistently within the 74-84% range throughout the test periods from March 2019 to October 2024. The plotted data showed minor fluctuations, but an overall trend of stability, indicating that the Yo-Yo model effectively captures market signals and reduces prediction errors despite the volatility.

Figure 2 displays the percentage of predictions that surpass the 50% confidence threshold. This share varies with market volatility; on average, 42.5% of predictions pass the threshold and are considered.
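The two quantities reported here, accuracy on high-confidence predictions and the share of predictions that pass the threshold, can be computed as sketched below. The record layout (predicted label, true label, confidence) and the function name are assumptions made for illustration, and the sample records are invented rather than taken from the study.

```python
def selective_metrics(records, threshold=0.5):
    """Compute accuracy and coverage for predictions above a confidence threshold.

    `records` is a list of (predicted_label, true_label, confidence) tuples.
    Returns (accuracy, coverage): accuracy is measured only on the retained
    predictions, and coverage is the fraction of predictions retained.
    """
    kept = [(pred, true) for pred, true, conf in records if conf > threshold]
    coverage = len(kept) / len(records) if records else 0.0
    accuracy = sum(pred == true for pred, true in kept) / len(kept) if kept else float("nan")
    return accuracy, coverage

# Invented records for a single test period
records = [(2, 2, 0.8), (0, 1, 0.9), (1, 1, 0.3), (2, 0, 0.2)]
accuracy, coverage = selective_metrics(records)
print(f"accuracy={accuracy:.1%}, coverage={coverage:.1%}")
```

Applying the function to each test period separately would give one accuracy value and one coverage value per period, which is the kind of series plotted in Figures 1 and 2.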

Figure 3 shows the support for different classes (price movement up, down, stationary) throughout the runs. The distribution of support across classes 0 (down), 1 (stationary), and 2 (up) remained fairly balanced during most of the evaluation period. However, there was a notable increase in the support for class 1 (stationary) between late 2022 and early 2023. This indicates a period of lower market activity or higher price stability during that window, consequently increasing the confidence in the predictions, as reflected in Figure 2.

Despite the larger number of high-confidence predictions, accuracy decreased slightly, suggesting that price-movement prediction is more challenging in a sideways market. In periods of higher volatility (where support for label 0 was proportionally the lowest), the model produced fewer predictions above the confidence threshold, reflecting higher output entropy in periods of uncertainty. Even so, accuracy on the predictions it did make in these periods remained above 80%.
