Date: 2024-12-05

Degree: Doctoral Thesis

Programme: Doctor of Business Administration

Authors: Lin Chin Yang

Supervisors: Professor Joao Alexandre Lobo Marques, University of Saint Joseph


Abstract:

The stock market’s inherent volatility and complexity pose significant challenges for investors seeking to optimize their strategies. This thesis addresses the need for improved stock price forecasting by proposing a hybrid approach that combines machine learning (ML) models, specifically Support Vector Machines (SVM) and Long Short-Term Memory (LSTM) networks, with sentiment analysis derived from financial news and social media platforms.

The research establishes a theoretical framework integrating quantitative data, such as historical stock prices, with qualitative sentiment data to enhance prediction accuracy. The study collects a comprehensive dataset of stock prices and sentiment scores drawn from news articles and social media posts, covering January 2010 to December 2023. Rigorous data preprocessing, including normalization and feature engineering, prepares the data for analysis.
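As an illustration of the normalization and feature engineering mentioned above, the following is a minimal sketch rather than the pipeline used in the thesis. It assumes daily closing prices and one aggregated sentiment score per day; the function names, the window length, and the up/down target are hypothetical choices made for the example.

```python
from typing import List, Tuple


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values linearly into [0, 1]; a constant series maps to zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def build_features(
    closes: List[float], sentiment: List[float], window: int = 3
) -> List[Tuple[List[float], int]]:
    """Pair each day's features with a next-day direction label.

    Features: the last `window` normalized closes plus that day's sentiment.
    Label: 1 if the next close is higher than the current close, else 0.
    """
    if len(closes) != len(sentiment):
        raise ValueError("closes and sentiment must have the same length")
    # Assumption for brevity: min/max are taken over the whole series.
    # In practice they should come from the training split only, to avoid
    # leaking future information into the features.
    scaled = min_max_normalize(closes)
    rows = []
    for t in range(window - 1, len(closes) - 1):
        features = scaled[t - window + 1 : t + 1] + [sentiment[t]]
        label = 1 if closes[t + 1] > closes[t] else 0
        rows.append((features, label))
    return rows


if __name__ == "__main__":
    closes = [10.0, 12.0, 11.0, 14.0, 13.0]
    sentiment = [0.1, -0.2, 0.3, 0.0, 0.5]
    for features, label in build_features(closes, sentiment, window=2):
        print(features, label)
```

Rows of this shape could feed either model family: an SVM would take each feature list as a flat vector, while an LSTM would typically take the windowed prices as a sequence.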

A comparative analysis of the SVM and LSTM models uses multiple performance metrics, including Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and classification accuracy. The findings reveal that the LSTM model significantly outperforms the SVM model in predictive accuracy, demonstrating its capability to capture complex temporal dependencies inherent in financial time series data.
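The three metrics named above can be computed as in the following sketch. It uses the standard definitions of MSE, RMSE, and classification accuracy; the function names are illustrative, and the assumption that accuracy is measured on up/down direction labels is ours, not a detail stated in the abstract.

```python
import math
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between actual and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Square root of the MSE, expressed in the same units as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


def classification_accuracy(labels: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of predicted class labels that match the true labels."""
    if len(labels) != len(predicted) or len(labels) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(1 for a, b in zip(labels, predicted) if a == b) / len(labels)


if __name__ == "__main__":
    actual = [3.0, 5.0]
    forecast = [1.0, 5.0]
    print(mean_squared_error(actual, forecast))
    print(root_mean_squared_error(actual, forecast))
    print(classification_accuracy([1, 0, 1, 1], [1, 1, 1, 0]))
```

MSE and RMSE apply to the predicted price values, whereas accuracy requires discrete labels, so a comparison that reports all three evaluates each model on both a regression and a classification view of the same forecasts.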

Furthermore, integrating sentiment analysis significantly enhances the predictive performance of both models. Notably, transformer-based sentiment analysis techniques, such as BERT and DistilBERT, provide superior sentiment classifications compared to traditional methods like VADER and TextBlob. The empirical results indicate that incorporating sentiment data leads to an average accuracy improvement of 12.8% over models that rely solely on historical price data.

This research contributes to the evolving field of financial forecasting by emphasizing the importance of a hybrid approach that amalgamates quantitative and qualitative data. The implications of these findings extend beyond academic research, offering valuable insights for investors and financial analysts seeking to leverage advanced predictive models to navigate market uncertainties. Ultimately, this dissertation advocates adopting sophisticated hybrid models that enhance stock investment strategies and decision-making processes in the finance sector.

Keywords: Stock Investment Prediction, Artificial Intelligence (AI), Long Short-Term Memory (LSTM), Machine Learning (ML), Support Vector Machines (SVM), Sentiment Analysis, Bidirectional Encoder Representations from Transformers (BERT).