Abstract

Financial time-series forecasting is challenged by non-linear, non-stationary dynamics driven by macroeconomic factors, market sentiment, and stochastic events. Traditional statistical models assume stationarity and linear dependencies and therefore fail to capture complex temporal patterns, while deep learning approaches struggle with vanishing gradients and long-term dependencies. Standard Transformers incur high computational costs because of their large parameter counts and their attention mechanisms, whose cost is quadratic in sequence length ($O(n^2 \cdot d)$ per layer, where $n$ is the sequence length and $d$ is the model dimension). This study proposes LiteFormer, a lightweight, encoder-only Transformer for univariate stock price forecasting, built from encoder layers with multi-head self-attention and feed-forward networks. Operating on sequences of closing prices, LiteFormer employs sinusoidal positional encodings, a causal mask, dropout, and layer normalization to model temporal dependencies and improve generalization. With a compact design of just over 750,000 parameters, LiteFormer reduces per-layer complexity, enabling low-latency inference (38 ms) and low power consumption (96.894 W), which makes it a promising candidate for scalable real-time inference in industrial fintech systems. Experiments across 30 stocks from the S&P 500, FTSE 100, and Nikkei 225 indices demonstrate Mean Absolute Error and Root Mean Square Error reductions of 3.45%–9.09% over vanilla Transformers and up to 48% over recurrent neural models for high-volatility stocks. LiteFormer's efficient architecture, made interpretable through its attention weights, offers a scalable solution with potential for multivariate extensions and real-world multi-modal applications in predictive domains.
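As a rough illustration of three ingredients named above (sinusoidal positional encodings, a causal mask, and the quadratic cost of attention), the following is a minimal standard-library sketch. It is not the LiteFormer implementation: the function names are hypothetical, and the sizes in the usage lines are illustrative placeholders, not the model's actual hyperparameters.

```python
import math


def sinusoidal_positional_encoding(seq_len, d_model):
    """Return a seq_len x d_model table: sine on even columns, cosine on odd."""
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe


def causal_mask(seq_len):
    """mask[t][s] is True when time step t may attend to step s (s <= t)."""
    return [[s <= t for s in range(seq_len)] for t in range(seq_len)]


def masked_softmax(scores, allowed):
    """Softmax over one row of attention scores; masked positions get weight 0."""
    kept = [s for s, ok in zip(scores, allowed) if ok]
    peak = max(kept)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if ok else 0.0 for s, ok in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]


def score_matrix_multiply_adds(n, d):
    """Multiply-adds needed to form the n x n score matrix Q K^T alone."""
    return n * n * d


# Illustrative sizes only (assumed, not taken from the paper).
pe = sinusoidal_positional_encoding(seq_len=4, d_model=8)
mask = causal_mask(4)
weights = masked_softmax([0.2, 1.5, -0.3, 0.9], mask[1])  # step 1 sees steps 0..1
```

Doubling the sequence length in `score_matrix_multiply_adds` quadruples the count, which is the quadratic behaviour the abstract refers to. The mask ensures that the weight assigned to any future closing price is exactly zero.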