Prediction and Optimization of Civil Aviation Flight Delays Based on Machine Learning Algorithms

Qingwei Zhong¹,
Yingxue Yu¹,
Yiru Huang¹ &
…
Tianhang Zhang¹


Abstract

The civil aviation industry continues to face the significant problem of flight delays, which impacts operational efficiency and passenger satisfaction. This research aims to develop an innovative model that accurately predicts civil aviation flight delays and provides insights related to performance enhancement. The proposed Flight Delay Prediction Network with Spatio-Temporal Learning (FlightNet-ST) is a hybrid deep learning architecture that combines Long Short-Term Memory (LSTM) networks, Graph Convolutional Networks (GCNs), and 3D Convolutional Neural Networks (3D-CNNs) to achieve this goal. The model is trained using datasets of domestic flights that include geographical, operational, and temporal data, such as geographical separation, origin–destination pairs, airline rules, scheduled departure times, and dates. The approach involves running time series data through LSTM to capture temporal dependencies, applying 3D Convolutional Neural Networks (3D-CNNs) to analyze aircraft route grids dynamically, and utilizing Graph Convolutional Networks (GCNs) to discover topological patterns from spatial airport connectivity. Delay prediction is powered by a unified representation that fuses these disparate elements. Based on the experimental data, FlightNet-ST achieves a 14.47% reduction in Mean Absolute Error (MAE). Additionally, an attention mechanism enhances interpretability by highlighting key aspects that influence delays, such as departure time blocks and airport-specific trends. Finally, FlightNet-ST helps with civil aviation flight delay prediction and management with its data-driven, interpretable, and robust solution. This methodology facilitates real-time operational decision-making and provides tactics to mitigate delays.


1 Introduction

Flight delays continue to affect civil aviation, airline performance, customer satisfaction, and financial efficiency. Accurate delay prediction is crucial as global air traffic continues to increase. Although valuable, traditional models struggle with the complexity and multidimensionality of flight data. Modern machine learning methods are promising but often lack spatial–temporal analysis and interpretability. FlightNet-ST, a hybrid deep learning framework combining LSTM, GCN, and 3D-CNN models, addresses these constraints to improve prediction accuracy and enable real-time decision-making in dynamic aviation contexts.

1.1 Overview of Flight Delay Prediction in Civil Aviation

Flight delays plague civil aviation, hurting operational planning, air traffic management, customer satisfaction, and airline profitability. Global aviation studies indicate that delays lead to increased fuel expenditures, resource misallocation, and cascading disruptions throughout interconnected aircraft networks [1]. Modern airlines require accurate flight delay prediction as urbanization and globalization increase passenger demand, delay frequency, and their associated impact.

Traditional statistical models and advanced machine learning are used in delay prediction studies. Early models employed historical patterns and linear regression, but data-driven models utilizing complex spatio-temporal information have become increasingly popular. This transition allows airport authorities and airline operators to make more proactive decisions by extracting deep insights from aviation databases [2]. Deep learning architectures have emerged because they can handle complex and diverse inputs.

Despite these developments, conventional models often overlook the multidimensional dependencies of flight operations. Airport geographical connections, dynamic temporal patterns, and real-time operational metrics must be studied to improve forecast accuracy. Combining multiple learning paradigms (temporal, spatial, and topological) is emerging as a possible solution to the limitations of present methods [3].

1.2 Need for Intelligent Predictive Systems

The disadvantages of conventional scheduling and delay management methods have been more noticeable in recent years due to the ever-increasing demand for airports and airlines. Unfortunately, most current systems are not smart enough to foresee such disruptions, so they have to react after the fact, which is wasteful and expensive.

Academics and professionals in the field are increasingly in agreement that data-driven predictive solutions have the potential to revolutionize the management of delays in civil aviation [4, 5]. These solutions can help with the better allocation of resources, improved decision-making, and, ultimately, happier customers. Machine learning has made it feasible for airlines to anticipate delays with greater precision and clarity, enabling them to take preventive actions to mitigate the impact of these occurrences [6].

This research presents an innovative and reliable framework for predicting flight delays. With its attention mechanism and foundation in LSTM neural networks, it seeks to reliably predict flight delays and isolate the most critical operational and temporal variables that cause them [7]. The extensive dataset on which the model is trained includes several flight and schedule-related parameters, including historical domestic flight records [8].

1.3 Research Problem Definition

The increased complexity of air traffic systems makes flight delay prediction an open research subject that requires novel, hybrid solutions. Multiple variables, including scheduled departure time, aircraft movement patterns, weather, and network congestion, form a complicated system where delays spread unpredictably. These issues require an intelligent system that integrates multiple data sources and learns complex correlations over time and space [9].

Widely used flight delay prediction models, such as Random Forest and Gradient Boosting, fail to capture the temporal sequence of events and the spatial dependencies between airports. In addition, models that are not interpretable give stakeholders projections without the actionable insights they need in practice [10]. The primary research problem is therefore to develop a robust, interpretable model that uses advanced spatio-temporal learning to estimate civil aviation delays.

1.3.1 Objectives of the Research

To create FlightNet-ST, a hybrid deep learning model incorporating temporal, spatial, and dynamic trajectory data utilizing LSTM, GCN, and 3D-CNN architectures for predicting civil aviation flight delays.
To improve model interpretability and support aviation stakeholders' decision-making through an attention technique that highlights the crucial spatio-temporal aspects affecting delay outcomes.
To evaluate the model's prediction power and generalizability using real-world flight datasets and compare it to other machine learning methods.

1.4 Methodology Overview

FlightNet-ST, a hybrid deep learning framework comprising LSTM networks, GCNs, and 3D-CNNs, addresses the drawbacks of conventional models. This design allows the model to learn from temporal sequences, spatial airport connections, and aircraft trajectory data. Model components and their roles are as follows:

LSTM networks are a powerful tool for modeling the temporal relationships in flight behavior, essential for processing sequential data like flight dates and scheduled departure times [11].
GCN is used to analyze the spatial inter-airport relationships using origin–destination networks and consider the flight network's spatial dependencies [7, 12].
Due to the dynamic nature of flight routes over time, 3D-CNNs are employed to extract features from dynamic flight trajectory grids [13].

An attention mechanism has been incorporated into the model to enhance its interpretability. To provide transparency into the decision-making process, this technique focuses on the essential spatio-temporal aspects that substantially influence delay estimates [14].

The FlightNet-ST model offers several key contributions to the field of flight delay prediction:

FlightNet-ST improves prediction accuracy by integrating LSTM, GCN, and 3D-CNN components into a single architecture. This architecture gathers spatial, temporal, and dynamic trajectory information simultaneously.
The attention method enhances the model's transparency for domain specialists by helping them understand which parameters, such as departure time blocks or airport-specific delays, are most significant in the delay prediction.
Compared to CNN-GCN-LSTM combinations, FlightNet-ST achieves a Mean Absolute Error (MAE) reduction of 14.47%, and it also outperforms classical models such as Gradient Boosting and Random Forest.
Real-world domestic flight data are utilized to develop the algorithm. Flight time, place of departure and arrival, date, and day of the week are among the operational, geographical, and temporal variables in this dataset. Training on this range of variables is intended to improve model generalizability.
To aid airlines in optimizing their schedules and reducing the cascading effects of delays, FlightNet-ST generates quick and interpretable predictions, which can enhance real-time decision-making.

1.4.1 Key Contributions

To develop FlightNet-ST, a unifying multi-modal deep neural architecture that combines LSTM, GCN, and 3D-CNN modules to make precise predictions for flight delays by learning temporal, spatial, and trajectory-based characteristics.
To propose a new airport-level congestion measure that estimates operational saturation and incorporates it into the spatial and temporal learning procedures of the model.
To improve the interpretability of the model through an attention mechanism to determine the most impactful spatio-temporal features that contribute to flight delays.
To assess the suggested model using regression metrics (MAE, RMSE, MAPE) and binary classification metrics (accuracy, precision, recall, F1 Score) and determine its outperformance over baseline machine learning and deep learning methods.
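The evaluation metrics named above can be illustrated with a minimal sketch. It uses only the Python standard library; the function names (`regression_metrics`, `classification_metrics`, `relative_reduction`) and all input values are hypothetical and are not taken from the experiments. `relative_reduction` shows how a relative improvement such as an MAE reduction is typically computed from a baseline error and a model error.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, MAPE in percent) for predicted delay minutes."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined for zero targets, so those samples are skipped here.
    pct = [abs(e) / abs(t) for e, t in zip(errors, y_true) if t != 0]
    mape = 100.0 * sum(pct) / len(pct) if pct else float("nan")
    return mae, rmse, mape


def classification_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, F1) for binary delay labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


def relative_reduction(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Toy values for illustration only.
print(regression_metrics([10, 20, 30], [12, 18, 33]))
print(classification_metrics([1, 0, 1, 1, 0], [1, 0, 0, 1, 1]))
print(relative_reduction(10.0, 8.0))
```

Skipping zero targets in MAPE is one possible convention; on-time flights with zero delay minutes would otherwise cause a division by zero.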

1.4.2 Novelty of the Work

The novelty of this work lies in the combination of temporal, spatial, and dynamic trajectory data in a single predictive model, FlightNet-ST, utilizing LSTM, GCN, and 3D-CNN architectures. In contrast to current models that focus on one dimension of data, our method jointly learns sequential flight behavior, airport network structure, and en-route dynamics. This work also proposes a real-time congestion index, which provides the model with a richer contextual understanding of airport saturation. The addition of an attention mechanism introduces explainability by focusing on the features most relevant to delay. This integrated, explainable, and scalable model represents a step forward in data-driven flight delay prediction, offering both accuracy and operational practicability.

1.4.3 Key Challenges

Flight delay forecasting faces several critical issues that limit the usefulness of existing models. First, the data are complex and high-dimensional, involving temporal patterns, spatial airport networks, and operational factors that cannot be modeled adequately using classic or unimodal methods. Second, most existing methodologies are unable to incorporate spatio-temporal correlations, resulting in low forecast accuracy. Third, real-time changes in flight paths or en-route conditions are typically ignored, limiting the ability to model mid-flight interruptions. Additionally, most deep learning models applied in such scenarios have low interpretability, making it difficult to identify the exact determinants of delay outcomes. Finally, existing systems do not react quickly to changing operating environments, such as weather fluctuations and airport congestion, which limits their application in real-time decision-making contexts. These problems indicate the need for an interpretable, unified, and adaptive model that predicts flight delays accurately.

The remainder of this paper is organized as follows. Section 2 provides an extensive literature review related to the development of flight delay prediction and multi-model ensembling. Section 3 presents the methodology, design, data preprocessing, and integration of the fundamental building blocks of the FlightNet-ST model. Section 4 presents the dataset properties, performance measures, and experimental scenario. Section 5 presents detailed results, highlighting the model's prediction accuracy and the conclusions obtained through the attention mechanism. Lastly, Sect. 6 concludes the work and outlines potential paths for future research and improvement.

2 Related Work

Airline delay prediction, traffic forecasting, and resource management have led to an increased use of machine learning, simulation models, and performance enhancement frameworks. Cognitive error analysis, regression modeling, neural networks, and reinforcement learning are used in investigations of aviation, healthcare, and business processes. This section reviews flight delay causes and effects, machine learning-based forecasting models, and performance enhancement solutions. It emphasizes the necessity for resilient, dynamic, and generalizable solutions across application settings due to scalability, interpretability, and real-time adaptation limits.

2.1 Flight Delays in Civil Aviation: Causes and Impacts

Shorrock et al. [15] used the TRACEr cognitive error classification technique and its structured taxonomy framework to identify and analyze human errors in Air Traffic Control (ATC). Expert knowledge and structured interviews are utilized for retrospective and predictive error analysis, rather than machine learning models. The data collection focuses on reduced separation minima and includes real-world incident reports, controller interviews, and UK ATC operational observations. TRACEr identified cognitive weaknesses and human–system interaction concerns, making operational planning safer. Its reliance on subjective expert evaluation makes it unsuitable for automated or large-scale deployment.

Kontogiannis et al. [16] applied cognitive science principles to enhance human performance, error detection, and resilience without relying on an algorithm. Literature synthesis, psychological theory, and observational insights from high-stakes industries, such as aviation and healthcare, underpin the research. According to the framework, action, outcome, awareness, and planning-based mistake detection strategies can improve error management training. Since the tactics have not been evaluated in large-scale behavioral research or experimental trials, their main drawback is the lack of empirical validation.

Wu et al. [18] analyzed inherent and propagated airline schedule delays under stochastic operating conditions using a simulation-based modeling algorithm. The dataset comprises simulated flight operations, genuine delay records, and airline scheduling data. Results show that actual delays typically surpass inherent delays and that buffer time design improves schedule robustness. Modeled interruptions may not convey real-time operational complexity and unforeseen events.

Gao et al. [19] use Quick Access Recorder data from several domestic airports to estimate the runway occupancy time of arriving aircraft with a GA–PSO optimized Back Propagation neural network. SHapley Additive exPlanations (SHAP) are used to determine how input parameters affect occupancy time. The suggested model predicts occupancy time accurately, providing valuable insights into aircraft separation and supporting improved runway design performance. Limitations include variability in airport data quality and limited adaptation to rare or extreme operational instances.

2.2 Machine Learning Approaches for Flight Delay Prediction

Li et al. [17] analyzed the ST-Random Forest model using spatial and temporal information to predict flight delays. LSTM captures airport congestion and weather trends, whereas complex network theory collects spatial and temporal characteristics from the aviation network at the edge, node, and network levels. The combined features are fed into a Random Forest classifier to predict delays. After training and testing on Chinese domestic flight data from June to August 2016, the model achieved 92.39% accuracy, 86% for on-time flights, and 95% for delayed flights. Performance is high, but historical data quality and real-time adaptation to unanticipated disturbances are issues.

Lu et al. [20] introduced a machine learning classification system that predicts flight delays with greater generalization using high-dimensional temporal and spatial information. Using historical flight records, it considers airport conditions, previous flight status, and route congestion. The model captured complex delay patterns with great accuracy and dependability on recent data. Model adaptation across regions and operational contexts may restrict its universal application.

Azam et al. [21] emphasized the simplicity, speed, and anomaly detection capabilities of the Decision Tree algorithm in intrusion detection systems (IDS). IDS performance is assessed using NSL-KDD, CICIDS2017, and UNSW-NB15 datasets. According to the research, decision trees have significant false-positive rates and struggle to detect sophisticated, evolving threats without the use of ensemble or hybrid upgrades.

Khodabandelou et al. [22] discussed a convolutional attention-based recurrent neural network (CARNN) that forecasts traffic speed using only traffic flow data, without previous speed or network graphs. The model was trained on data from millions of vehicles on various road links in Greater Paris (2016–2017). Capturing local temporal dependencies and interdependencies influenced by non-free-flow situations yielded high accuracy. The model's main benefit is its low data requirement. Still, it may be less adaptable to diverse geographic contexts without retraining, especially in places with distinct traffic dynamics or infrastructure layouts.

Dalmau et al. [23] extend a machine learning model to improve aircraft takeoff time predictions from flight plan reception. The revised model reduces forecast error by 30% compared to the Enhanced Traffic Flow Management System, utilizing historical flight and operational data from the EUROCONTROL Maastricht UAC, and features optimized for real-world deployment. It interprets feature importance and interactions via additive feature attribution. Its inability to adjust to real-time obstacles, such as weather or runway closures, may degrade performance without dynamic updates.

2.3 Performance Enhancement Strategies and Decision Support Systems

Mizan et al. [24] introduced an Ensemble of Pruned Regressor Chain (EPRC) model and a multi-objective optimization method within a three-phase solution framework to reduce patient waiting times in Canadian radiology departments. Using real-time patient-arrival data from a Canadian healthcare service, the model accurately predicts arrival volumes, missed target arrival times (TAT) rates, and workload demands. EPRC reduced patient waiting time by 8.17% and improved prediction accuracy by 10.81% with 25% less workload than other multi-target algorithms. The model's reliance on historical trends may hinder performance during unexpected healthcare shocks.

Park et al. [25] proposed a two-phase strategy to enhance business process resource allocation by utilizing machine learning-based predictive models and optimization algorithms. The design science algorithm aligns resource assignments with expected task deadlines to optimize total weighted completion time. The technique enhances task-resource matching efficiency and accuracy in real-world business process event logs, significantly reducing process delays and improving workload distribution. In highly dynamic contexts, prediction accuracy and task length variability limit the model's usefulness and the reliability of resource scheduling.

Seyyedabbasi et al. [26] propose three reinforcement learning (RL)-based hybrid metaheuristic algorithms that tackle challenging global optimization problems using Q-learning and classical optimization. A reward-penalty scheme and Q-tables help RL agents dynamically balance exploration and exploitation, enhancing search performance without relying on models. These methods are tested on 30 CEC 2014 and 2015 benchmark functions and applied to the inverse kinematics of a robotic arm. The optimization results are better than those of GWO and WOA. However, applying Q-learning in large or real-time systems incurs substantial computing costs.

2.4 Estimated Time of Arrival (ETA) Prediction in Aviation

Silvestre et al. [27] proposed a deep learning LSTM model that forecasts flight arrival times based on destination weather and 4D trajectory information. The LSTM networks process weather data and spatio-temporal flight routes for Madrid-Barajas airport. On 2022 flight data, the model produced competitive short- and long-term forecasts over various horizons, with a mean absolute error of 2.65 min. Limitations include weather data dependency, possible generalizability problems, and validation at a single airport.

Huang et al. [28] combined ensemble approaches, neural networks, and tree models to provide a novel ETA estimate solution for transportation systems. A hybrid ensemble strategy combines neural network topologies and tree-based algorithms for providing reliable arrival time prediction. The study achieved first place in the SIGSPATIAL 2021 GISCUP competition, demonstrating exceptional accuracy and resilience on A/B test datasets. Limitations include a limited examination of real-world deployment, potential overfitting to contest data, and validation specific to the competition.

Zheng et al. [29] suggested a data-light trajectory-based machine learning method that uses only latitude, longitude, and speed data to anticipate flight arrivals in real time. By comparing current flights with past trajectories, the technique uses gradient boosting machines for speed estimates and LSTM networks for trajectory prediction. Tests conducted on actual US flights demonstrated that this approach outperformed others while requiring less computing power and being straightforward to use for individuals with limited data access.

Table 1 presents research on aircraft delay prediction, traffic performance enhancement, intelligent resource allocation, machine learning, and hybrid algorithms. Aviation, healthcare, and transportation use neural networks, decision trees, and reinforcement learning. Each study's technique, dataset, results, and limitations are evaluated for scalability, real-time flexibility, and data dependency. Predictive models and performance enhancement algorithms are increasingly used for proactive decision-making and generalizable, interpretable, and computationally efficient solutions in dynamic operational circumstances.

Table 1 Overview of Key Research on Flight Delay and Performance Enhancement Models


3 Architecture of FlightNet-ST

FlightNet-ST, a hybrid deep learning framework for accurate and interpretable flight delay prediction, is shown in Fig. 1. It combines LSTM, GCNs, and 3D-CNNs to integrate spatio-temporal and trajectory data. The model has four modules that learn dataset features. The dataset includes real-world domestic flight records with dynamic flight paths, operational parameters (airline, scheduled time), geographical data (origin–destination airports), and temporal features (day, day of the week).

3.1 Input Data Processing

FlightNet-ST relies on input data processing to prepare aviation data for deep learning. The dataset has temporal, spatial, and operational features. Temporal inputs such as $Year$, $Month$, $DayOfWeek$, and $CRSDepTime$ are encoded into a matrix $\text{T}\in {\mathbb{R}}^{\text{n}\times {\text{d}}_{\text{T}}}$, where $\text{n}$ is the number of flight samples and ${\text{d}}_{\text{T}}$ is the number of temporal features. This sequence data is passed to a Long Short-Term Memory network ${\text{f}}_{\text{LSTM}}$, which models time-dependent delay patterns. Spatial information from airport attributes ($Origin$, $Dest$, $OriginState$, $DestState$) is structured as a graph $\text{G}=(\text{V},\text{E})$, in which each airport node is described by a row of the feature matrix ${\mathbf{X}}_{\mathbf{s}}\in {\mathbb{R}}^{|\text{V}|\times {\text{d}}_{\text{s}}}$, where $|\text{V}|$ is the number of airports and ${\text{d}}_{\text{s}}$ is the dimensionality of the spatial descriptors. The corresponding adjacency matrix $\mathbf{A}\in {\mathbb{R}}^{|\text{V}|\times |\text{V}|}$, built using flight frequency and historical delay correlations, feeds into a Graph Convolutional Network ${\text{f}}_{\text{GCN}}$ to capture spatial relationships. Meanwhile, operational data such as $\text{Distance}$, $\text{AirTime}$, and $\text{ArrDelay}$ are transformed into a 3D grid tensor ${\mathbf{X}}_{\mathbf{t}}\in {\mathbb{R}}^{\text{h}\times \text{w}\times \text{t}}$ and processed through a 3D Convolutional Neural Network ${\text{f}}_{3\text{D}-\text{CNN}}$. These three modality-specific outputs are concatenated into a unified feature vector in Eq. 1:

$$\mathbf{F}={\text{f}}_{\text{LSTM}}\left(\text{T}\right)\oplus {\text{f}}_{\text{GCN}}\left(\text{A},{\text{X}}_{\text{s}}\right)\oplus {\text{f}}_{3\text{D}-\text{CNN}}\left({\text{X}}_{\text{t}}\right).$$

(1)

An attention layer $\alpha \left( F \right)$ assigns interpretability by weighting key features before the final output layer applies learned weights $\text{W}$ and an activation function $\upsigma$, producing the predicted flight delay label $\widehat{\text{y}}$ expressed in Eq. 2:

$$\hat{y} = \sigma \left( W^{T} \cdot \alpha \left( F \right) \right).$$

(2)

This structured data processing ensures the model captures the intricate dependencies in air travel dynamics, enabling accurate and interpretable predictions of the target variable $\text{is}\_\text{delay}$.
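The fusion and output steps in Eqs. 1 and 2 can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the implementation used in the study: the text does not specify the form of $\alpha(F)$, so the sketch assumes a feature-wise softmax attention that rescales each entry of $F$. The function `predict_delay`, the attention parameters `attn_w`, and all numeric inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(z):
    """Logistic activation used as sigma in Eq. 2."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_delay(h_lstm, h_gcn, h_cnn, attn_w, out_w):
    """Fuse the three module outputs and return (y_hat, attention weights)."""
    # Eq. 1: concatenate the modality-specific feature vectors into F.
    fused = list(h_lstm) + list(h_gcn) + list(h_cnn)
    # Assumed attention: one score per feature, normalized with softmax.
    scores = [w * f for w, f in zip(attn_w, fused)]
    weights = softmax(scores)
    attended = [a * f for a, f in zip(weights, fused)]
    # Eq. 2: linear output layer followed by the activation sigma.
    logit = sum(w * x for w, x in zip(out_w, attended))
    return sigmoid(logit), weights


# Toy module outputs; the values carry no experimental meaning.
y_hat, weights = predict_delay(
    h_lstm=[0.5, -0.2], h_gcn=[1.0], h_cnn=[0.3, 0.1],
    attn_w=[1.0] * 5, out_w=[1.0] * 5,
)
print(round(y_hat, 3), [round(w, 3) for w in weights])
```

Returning the attention weights alongside the prediction reflects the interpretability goal: the largest weights indicate which fused features contributed most to a given estimate.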

3.1.1 Spatio-Temporal Analysis of Traffic Congestion Index

To integrate the Congestion Index calculation, air traffic congestion must be modeled statistically in the spatial and temporal dimensions and encoded into the graph and grid-based inputs of FlightNet-ST. The goal is to quantify the local and network-wide traffic strains that cause delays. The Congestion Index (CI) measures operational congestion at airport ${\text{a}}_{\text{i}}$ at time $\text{t}$, represented as $\text{CI}({\text{a}}_{\text{i}},\text{t})$. It quantifies the impact of congestion on airport performance and the risk of delay, and it incorporates several parameters to reflect traffic demand and historical delay patterns. The index considers scheduled traffic volume, comprising departures ${\lambda }_{\text{dep}}({\text{a}}_{\text{i}},\text{t})$ and arrivals ${\lambda }_{\text{arr}}({\text{a}}_{\text{i}},\text{t})$, relative to the airport's maximum capacity ${\text{CI}}_{max}({\text{a}}_{\text{i}})$. Beyond the current load, the CI accounts for historical delay propagation through a term $\phi \left( {a_{i} ,t} \right)$ indicating the average delay at airport $a_{i}$ during the same timeframe; this term models systemic congestion. Additionally, the average taxiing or holding time ${\tau }_{\text{avg}}({\text{a}}_{\text{i}},\text{t})$ captures queuing characteristics that may not be reflected in schedule data alone. The decay factor $\beta \in [0,1]$ provides temporal smoothness and controls the impact of recent congestion history on the current calculation by weighting the historical delay term against the queuing term. The congestion index combines these characteristics to provide a time-sensitive depiction of airport saturation that can be dynamically incorporated into delay prediction algorithms, thereby enhancing spatial–temporal contextual awareness. The Congestion Index $\text{CI}({\text{a}}_{\text{i}},\text{t})$ is defined as a weighted function that captures multidimensional congestion influences in Eq. 3:

$$CI\left( {a_{i} ,t} \right) = \left( {\frac{{\lambda_{dep} \left( {a_{i} ,t} \right) + \lambda_{arr} \left( {a_{i} ,t} \right)}}{{CI_{\max } \left( {a_{i} } \right)}}} \right)^{\gamma } + \beta \cdot \phi \left( {a_{i} ,t} \right) + \left( {1 - \beta } \right) \cdot \tau_{avg} \left( {a_{i} ,t} \right).$$

(3)

The first term models the saturation ratio (traffic vs. capacity), the second term accounts for historical delay propagation, and the third reflects real-time queue dynamics. Here, $\gamma>1$ is an exponential scaling factor that emphasizes capacity overflow scenarios, and ${\text{CI}}_{max}({\text{a}}_{\text{i}})$ is the maximum capacity or normalization factor for airport $a_{i}$. This index can be normalized across the dataset as in Eq. 4:

$$CI_{norm} \left( {a_{i} ,t} \right) = \frac{{CI\left( {a_{i} ,t} \right) - \mu_{CI}}}{\sigma_{CI}}.$$

(4)

In Eq. 4, $\mu_{CI}$ and $\sigma_{CI}$ are the mean and standard deviation of congestion values across all airports and time windows. To improve FlightNet-ST's spatial and temporal learning capacity, the normalized Congestion Index ${\text{CI}}_{\text{norm}}\left({\text{a}}_{\text{i}},\text{t}\right)$ is applied at several model stages. The GCN module adds ${\text{CI}}_{\text{norm}}\left({\text{a}}_{\text{i}},\text{t}\right)$ as an additional node feature in the spatial feature matrix ${\mathbf{X}}_{\mathbf{s}}\in {\mathbb{R}}^{|\text{V}|\times {\text{d}}_{\text{s}}}$, giving each airport node a congestion-aware temporal signal that informs network structure. In the 3D-CNN module, congestion values are projected onto the trajectory grid tensor ${\mathbf{X}}_{\mathbf{t}}\in {\mathbb{R}}^{\text{h}\times \text{w}\times \text{t}}$ as an extra feature channel, so that the model can learn spatio-temporal congestion patterns across aircraft routes. The Congestion Index is also included in the unified feature vector $\text{F}$, allowing $\alpha \left( F \right)$ to dynamically prioritize congestion-related cues for flight delay prediction. This integration captures the impacts of geographical and temporal congestion across the prediction process.
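A minimal sketch of the congestion index and its normalization, following the three terms of Eq. 3 and the z-score of Eq. 4, may make the calculation concrete. The function names, the default values of `gamma` and `beta`, the use of the population standard deviation, and the toy inputs are assumptions for illustration; the delay and taxi-time terms are assumed to be pre-scaled to comparable units.

```python
import statistics


def congestion_index(dep, arr, capacity, hist_delay, taxi_time,
                     gamma=2.0, beta=0.5):
    """Eq. 3: saturation term + weighted historical delay + weighted queuing."""
    if gamma <= 1.0:
        raise ValueError("gamma must be greater than 1")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    # Saturation ratio raised to gamma, which emphasizes capacity overflow.
    saturation = ((dep + arr) / capacity) ** gamma
    return saturation + beta * hist_delay + (1.0 - beta) * taxi_time


def normalize(values):
    """Eq. 4: z-score over all airport and time-window CI values."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population std is an assumption
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Toy inputs for three airport/time-window pairs (illustrative values only).
raw = [
    congestion_index(30, 30, 60, hist_delay=0.0, taxi_time=0.0),
    congestion_index(40, 50, 60, hist_delay=10.0, taxi_time=20.0),
    congestion_index(20, 10, 60, hist_delay=4.0, taxi_time=6.0),
]
print(raw)
print(normalize(raw))
```

With `gamma=2.0`, an airport running at 150% of capacity contributes a saturation term of 2.25 rather than 1.5, which is the overflow emphasis described for $\gamma>1$.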

3.1.2 Spatial Graph Modeling of Airport Congestion in FlightNet-ST

The FlightNet-ST approach to civil aviation flight delay prediction uses an airport-level congestion-aware graph structure to capture spatial information, as shown in Fig. 2. Airports and their operational zones are modeled as nodes, forming a directed graph $\text{G}=(\text{V},\text{E})$, where $\text{V}=\left\{{\text{v}}_{1},{\text{v}}_{2},\dots ,{\text{v}}_{|\text{V}|}\right\}$. The adjacency matrix $\mathbf{A}\in {\mathbb{R}}^{|\text{V}|\times |\text{V}|}$ encodes binary associations between nodes based on aircraft movement sequences: ${\text{A}}_{\text{ij}}=1$ if there is a direct path from segment ${\text{v}}_{\text{i}}$ to ${\text{v}}_{\text{j}}$, and ${\text{A}}_{\text{ij}}=0$ otherwise. This spatial graph structure is essential for understanding airport delay propagation across connected regions.

Each node ${\text{a}}_{\text{i}}$ is coupled with a normalized congestion index ${\text{CI}}_{\text{norm}}\left({\text{a}}_{\text{i}},\text{t}\right)$ to reflect delay accumulation and traffic density over time. The index uses real flight operational data from the dataset, including $\text{CRSDepTime}$, $\text{ArrDelayMinutes}$, $\text{AirTime}$, and origin–destination pairs. The normalized congestion score is incorporated into the node feature matrix ${\mathbf{X}}_{\mathbf{s}}\in {\mathbb{R}}^{|\text{V}|\times {\text{d}}_{\text{s}}}$.

Here, ${\text{d}}_{\text{s}}$ represents the number of spatial features, such as ${\text{CI}}_{\text{norm}}$, runway usage frequency, and segment delay ratio. Historical congestion patterns are stacked into temporal sequences as model inputs. The spatial–temporal input is given in Eq. 5:

$$\left\{{\text{X}}_{\text{s}}^{\text{t}-\text{n}},{\text{X}}_{\text{s}}^{\text{t}-\text{n}+1},\dots ,{\text{X}}_{\text{s}}^{\text{t}}\right\}\in {\mathbb{R}}^{{\left|\text{V}\right|\times \text{d}}_{\text{s}}\times \text{n}}.$$

(5)

In Eq. 5, $\text{n}$ represents the length of the historical time window. By applying graph convolutions to each temporal slice, the Graph Convolutional Network (GCN) learns spatial dependencies and delay propagation across airport structures. Within the FlightNet-ST pipeline, the resulting spatial topology and congestion-aware node embeddings support the downstream LSTM and 3D-CNN modules. FlightNet-ST predicts flight delays by jointly modeling temporal sequences, spatial connectivity, and route-based delay propagation.

3.1.3 Temporal Feature Construction in FlightNet-ST

FlightNet-ST constructs a Time Feature (TF) vector sequence to model flight delay based on chronological flight operations and environmental parameters, as shown in Fig. 3. This sequence is fed into the Long Short-Term Memory (LSTM) module to track delay propagation patterns. Records are indexed chronologically by $FlightDate$, $DayOfWeek$, and $CRSDepTime$.

One-hot encoding creates high-dimensional binary vectors that maintain airline, airport, and route information from categorical variables like $Reporting\_Airline, Origin, Dest$, and $\text{DistanceGroup}$. Quantitative delay signals, such as $ArrDelayMinutes$ and $AirTime$, are included to capture past performance and flight duration trends.
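A minimal sketch of the one-hot step is shown below. The helper names and the sample carrier codes are hypothetical; in practice the vocabulary would be fitted on the training split for each categorical column.

```python
def fit_vocabulary(values):
    """Map each distinct category to a stable integer position."""
    return {value: position for position, value in enumerate(sorted(set(values)))}


def one_hot(value, vocabulary):
    """Return a binary vector with a single 1 at the category's position."""
    vector = [0] * len(vocabulary)
    vector[vocabulary[value]] = 1
    return vector


# Hypothetical values of the Reporting_Airline column.
airlines = ["AA", "DL", "UA", "DL"]
vocabulary = fit_vocabulary(airlines)
encoded = [one_hot(code, vocabulary) for code in airlines]

# A feature row can then append quantitative signals to the binary part,
# here hypothetical ArrDelayMinutes and AirTime values.
feature_row = encoded[0] + [12.0, 95.0]
```

The vector length grows with the number of distinct airlines, airports, and routes, which is why the resulting representation is high-dimensional.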

Weather conditions are inserted into the time series sequences as binary (0–1) indicators to model environmental disruptions. The weather-induced disruption criteria reflect practical aviation thresholds from domestic US operations and are listed in Table 2.

Table 2 Weather-Induced Binary Indicators for Flight Delay Prediction in FlightNet-ST


The time series matrix ${\mathbf{X}}_{\mathbf{t}}\in {\mathbb{R}}^{\text{n}\times {\text{d}}_{\text{T}}}$ integrates the environmental indicators. Each row is a timestamped fusion of operational and meteorological features. The LSTM layer learns how temporal dependencies and interruptions affect delay risks. FlightNet-ST's temporal module thus represents flight delay factors robustly and in a time-sensitive way using multiple sequential data dimensions. The scheduled time features $CRSDepTime, DayOfWeek, Month,$ and $FlightDate$ record operating schedules and seasonal cycles. $ArrDelayMinutes$ and $AirTime$ show flight efficiency and delay trends, while one-hot encoding of $Reporting\_Airline, Origin$, and $Dest$ preserves route and operator identities. Notably, the model incorporates weather-related binary indicators based on the adjusted threshold conditions to reflect real-world environmental disruptions, such as storms, fog, snow, and low visibility, that can hamper operations. Together, these components give the LSTM layer a complete and temporally dynamic feature sequence, allowing the model to learn routine and aberrant delay patterns induced by time-dependent operational and environmental variables.

3.2 Temporal Delay Pattern Modeling with LSTM in FlightNet-ST

An LSTM network extracts temporal delay characteristics in FlightNet-ST. It is well suited to modeling time-dependent processes such as the buildup of delays across a flight schedule. Flight-specific information (e.g., scheduled departure time, actual arrival time), operational records (e.g., historical delays, airtime), and environmental conditions (e.g., severe weather indicators) are the time series inputs to the model. These sequential inputs allow the LSTM to capture both regular and extraordinary temporal patterns in flight operations.

The LSTM module governs information flow over time using a gated architecture. It has three main gates (forget, input, and output) and two key components (cell state and hidden state). At time step $(\text{t})$, the network receives the current feature values ${\text{x}}_{(\text{t})}$, the previous hidden state ${\text{h}}_{(\text{t})-1}$, and the previous cell state ${\mathbf{C}}_{(\text{t})-1}$. The internal computations of the LSTM are as follows:

Forget gate (${\mathbf{f}}_{(\text{t})}$): Determines which part of the prior cell state (${\mathbf{C}}_{(\text{t})-1}$) should be kept. It removes extraneous past delay or weather information from memory, which matters because flight delay patterns require long-term memory management.

$$f_{\left( t \right)} = \rho \left( {{\mathbb{W}}_{f} \cdot \left[ {h_{\left( t \right) - 1} ,x_{\left( t \right)} } \right] + b_{f} } \right).$$

(6)

Input gate (${\mathbf{i}}_{(\text{t})}$): Sets how much of the new input (${\text{x}}_{(\text{t})}$) is written to memory. This is useful for adding flight status, weather, and scheduling data, and it works with the candidate cell state to update memory.

$$i_{\left( t \right)} = \rho \left( {{\mathbb{W}}_{i} \cdot \left[ {h_{\left( t \right) - 1} ,x_{\left( t \right)} } \right] + b_{i} } \right).$$

(7)

Candidate cell state (${\widetilde{\mathbf{C}}}_{(\text{t})}$): Generates potential new content that can be added to the cell state. Delay-relevant information from the current time step is encoded here and is then selectively admitted through the input gate.

$${\tilde{C}}_{\left( t \right)} = \tanh \left( {{\mathbb{W}}_{c} \cdot \left[ {h_{\left( t \right) - 1} ,x_{\left( t \right)} } \right] + b_{c} } \right).$$

(8)

Cell state (${\mathbf{C}}_{(\text{t})}$): Combines the retained past memory with the newly selected information. This preserves the "memory line" of delays, so useful patterns are maintained. The LSTM's ability to learn from long-term sequences depends on this update.

$$C_{\left( t \right)} = f_{\left( t \right)} \odot C_{\left( t \right) - 1} + i_{\left( t \right)} \odot \tilde{C}_{\left( t \right)} .$$

(9)

Output gate and hidden state (${\mathbf{o}}_{(\text{t})},{\mathbf{h}}_{(\text{t})}$): The output gate determines which portion of the updated memory is output. The hidden state is passed to the next time step or to the prediction layer, which is necessary for transmitting learned delay patterns within FlightNet-ST.

$$o_{\left( t \right)} = \rho \left( {{\mathbb{W}}_{o} \cdot \left[ {h_{\left( t \right) - 1} ,x_{\left( t \right)} } \right] + b_{o} } \right),$$

(10)

$$h_{\left( t \right)} = o_{\left( t \right)} \odot \tanh \left( {C_{\left( t \right)} } \right),$$

(11)

In Eqs. 6–11, ${\text{x}}_{(\text{t})}$ is the input vector at time step $(\text{t})$ (e.g., features like $\text{CRSDepTime},\text{ ArrDelayMinutes}$); ${\text{h}}_{(\text{t})}$ is the hidden state at time step $(\text{t})$; ${\mathbf{C}}_{(\text{t})}$ is the cell state (memory) at time $(\text{t})$; ${\widetilde{\mathbf{C}}}_{(\text{t})}$ holds the candidate values to add to the cell state; ${\mathbf{f}}_{(\text{t})}$, ${\text{i}}_{(\text{t})}$, and ${\text{o}}_{(\text{t})}$ are the forget, input, and output gates, respectively; ${\mathbb{W}}_{\text{f}},{\mathbb{W}}_{\text{i}},{\mathbb{W}}_{\text{c}},{\mathbb{W}}_{\text{o}}$ are the weight matrices for each gate; ${\text{b}}_{\text{f}},{\text{b}}_{\text{i}},{\text{b}}_{\text{c}},{\text{b}}_{\text{o}}$ are the bias vectors; $\rho$ is the sigmoid activation function; $\tanh$ is the hyperbolic tangent activation function; and $\odot$ is the element-wise (Hadamard) product. With this temporal extraction module, FlightNet-ST can learn how delays progress over time. Using the LSTM's memory, the model incorporates both recurring patterns (e.g., weekday rush hours) and unexpected anomalies (e.g., weather-induced disruptions), improving delay forecasting and flight scheduling.
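The gate computations in Eqs. 6–11 can be traced with a small, framework-free sketch of a single LSTM time step. The parameter dictionary layout, the dimensions, and the sample values are hypothetical; a production model would rely on a deep learning library rather than plain lists.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(weights, vector, bias):
    """Compute W . v + b for a weight matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step: returns the new hidden state and cell state."""
    z = h_prev + x_t  # concatenation [h_(t)-1, x_(t)]
    f = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]
    i = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]
    c_tilde = [math.tanh(a) for a in affine(params["W_c"], z, params["b_c"])]
    o = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]
    # Keep part of the old memory and write part of the candidate memory.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, c_tilde)]
    # Expose a gated view of the updated memory as the hidden state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


# Hypothetical sizes: 2 hidden units, 3 input features per time step.
hidden, inputs = 2, 3
example_params = {
    name: [[0.1] * (hidden + inputs) for _ in range(hidden)]
    for name in ("W_f", "W_i", "W_c", "W_o")
}
example_params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})
h_t, c_t = lstm_step([0.5, -1.0, 2.0], [0.0, 0.0], [0.0, 0.0], example_params)
```

Calling `lstm_step` repeatedly over a sequence, feeding each returned pair back in as `h_prev` and `c_prev`, reproduces the recurrent pass over the time feature sequence.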

3.3 Spatial Delay Pattern Modeling with GCN in FlightNet-ST

FlightNet-ST's second module uses Graph Convolutional Networks to capture the spatial dependencies within the air traffic network. Origin–destination linkages between hubs influence the propagation of civil aviation delays. To model this inter-airport influence, the domestic flight network is represented as a graph in which nodes represent airports and edges represent direct flight routes.

In this graph topology, a delay at a major hub can cascade to adjacent airports. The GCN models this by processing an adjacency matrix $\widetilde{\text{A}}=\text{A}+\text{I}$, where $\text{A}$ represents airport connectivity and $\text{I}$ is the identity matrix, which adds self-loops so that each node retains its own attributes. The feature matrix ${\text{X}}_{\text{t}}$ encodes delay-relevant features for each airport, such as past congestion levels and scheduled flights. The GCN propagation rule aggregates features across connected nodes to extract high-level spatial characteristics, as given in Eq. 12:

$${\mathcal{C}}^{{\left( {l + 1} \right)}} = \sigma \left( {\widetilde{{\mathbb{E}}}^{ - 1/2} \tilde{A}\widetilde{{\mathbb{E}}}^{ - 1/2} {\mathcal{C}}^{\left( l \right)} \theta^{\left( l \right)} } \right),$$

(12)

where $\widetilde{\text{A}}$ is the adjacency matrix with added self-connections; $\widetilde{\mathbb{E}}$ is the degree matrix of $\widetilde{\text{A}}$, with ${\widetilde{\mathbb{E}}}_{\text{ii}}={\sum }_{\text{j}}{\widetilde{\text{A}}}_{\text{ij}}$; ${\mathcal{C}}^{(\text{l})}$ is the output of the $\text{l}$th GCN layer, with ${\mathcal{C}}^{(0)}={\text{X}}_{\text{t}}$; $\theta^{{\left( {\text{l}} \right)}}$ is the trainable weight matrix for layer $\text{l}$; and $\upsigma (\cdot )$ is a non-linear activation function (e.g., Sigmoid).

This formulation aggregates information from each node's first-order neighborhood, so the model learns how a given airport's condition affects its neighbors. To discover systemic congestion patterns and delay propagation channels, FlightNet-ST stacks multiple GCN layers to capture deeper spatial interactions across the network. This module produces a spatial embedding of congestion in the air network. The joint delay prediction combines this embedding with temporal and route-based characteristics.
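The propagation rule in Eq. 12 can be sketched for a single layer as follows. The sketch uses plain Python lists, a sigmoid activation as in the example given for Eq. 12, and a hypothetical three-airport graph; the feature and weight values are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def normalized_adjacency(adjacency):
    """Add self-loops, then scale each entry by 1 / sqrt(deg_i * deg_j)."""
    size = len(adjacency)
    a_tilde = [[adjacency[i][j] + (1 if i == j else 0) for j in range(size)]
               for i in range(size)]
    degree = [sum(row) for row in a_tilde]  # row sums; at least 1 due to self-loops
    return [[a_tilde[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(size)]
            for i in range(size)]


def gcn_layer(adjacency, features, weights):
    """One graph convolution: sigmoid(A_hat . features . weights)."""
    a_hat = normalized_adjacency(adjacency)
    z = matmul(matmul(a_hat, features), weights)
    return [[1.0 / (1.0 + math.exp(-value)) for value in row] for row in z]


# Hypothetical 3-airport graph, 2 input features per airport, 2 output features.
adjacency = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
features = [[0.8, 0.1], [0.2, 0.4], [0.5, 0.9]]
weights = [[0.3, -0.2], [0.1, 0.6]]
embedding = gcn_layer(adjacency, features, weights)
```

Stacking layers amounts to calling `gcn_layer` again with `embedding` as the new feature matrix, which widens each airport's receptive field by one hop per layer.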

3.4 Dynamic Trajectory Delay Pattern Modeling with 3D-CNN in FlightNet-ST

A 3D representation of real-time flight routes is used to infer spatio-temporal dependencies from aircraft trajectory data. Unlike static route-based data, this representation accounts for aircraft movement through time and space, so the model can detect route-level congestion, en-route delays, and spatial abnormalities during flight operations. The input to the 3D-CNN is a spatio-temporal trajectory grid of the dynamic flight path from origin to destination. That is, the flight path is encoded as a 3D volume of size $latitude \times longitude \times time$, with axes for latitude (ϕ), longitude (λ), and time (t). Within controlled airspace, these grids track aircraft spatio-temporal movement. Each voxel stores normalized real-time data, such as aircraft presence indicators, position, altitude, velocity components (e.g., ground speed, vertical rate), wind vectors, and meteorological conditions, including wind shear, turbulence, storms, and convective weather zones. This spatio-temporal cube feeds the 3D convolutional layers, allowing the network to learn rich movement patterns, route-level congestion, and real-time delay-inducing anomalies that are highly relevant to predicting en-route delay. The 3D Convolutional Neural Network (3D-CNN) captures both local and global aircraft movement patterns in the airspace and helps the framework track route congestion and en-route disturbances across time and space. The computation of a single 3D convolutional layer is given in Eq. 13:

$${\text{Y}}_{\text{i},\text{j},\text{k}}^{(\text{l})}=\upsigma \left(\sum_{\text{m}=1}^{{\text{C}}_{\text{l}-1}}\sum_{\text{u}=0}^{\text{d}-1}\sum_{\text{v}=0}^{\text{h}-1}\sum_{\text{r}=0}^{\text{w}-1}{\text{W}}_{\text{m}}^{\left(\text{l}\right)}\left(\text{u},\text{v},\text{r}\right)\cdot {\text{X}}_{\text{i}+\text{u},\text{j}+\text{v},\text{k}+\text{r}}^{(\text{m})}+{\text{b}}^{\left(\text{l}\right)}\right),$$

(13)

where ${\text{X}}^{(\text{m})}$ is the input volume from the ${\text{m}}^{\text{th}}$ channel at layer $\text{l}-1$; ${\text{C}}_{\text{l}-1}$ is the number of input channels; ${\text{W}}^{(\text{l})}$ is the 3D kernel (filter) of shape $\text{d}\times \text{h}\times \text{w}$ applied to the input; $\text{u},\text{v},\text{r}$ are the offsets within the kernel; ${\text{Y}}^{(\text{l})}$ is the output feature map at layer $\text{l}$; ${\text{b}}^{(\text{l})}$ is the bias term; $\upsigma$ is a non-linear activation function (e.g., ReLU); and $\text{i},\text{j},\text{k}$ are the indices of spatial–temporal positions in the output volume. The 3D-CNN module produces a compressed feature volume that captures localized spatio-temporal fluctuations in flight movement. The spatio-temporal representation for final delay prediction is created by fusing this 3D trajectory embedding with the LSTM and GCN module outputs.
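To make the 3D convolution concrete, the sketch below applies a single filter to a multi-channel volume with unit stride, no padding, and a ReLU activation. These choices, the function name, and the toy grid are assumptions for illustration; the layer configuration used in FlightNet-ST is not specified here.

```python
def conv3d_single_filter(volume, kernel, bias):
    """Valid 3D convolution of one filter over volume[channel][i][j][k] with ReLU."""
    channels = len(volume)
    depth, height, width = len(volume[0]), len(volume[0][0]), len(volume[0][0][0])
    kd, kh, kw = len(kernel[0]), len(kernel[0][0]), len(kernel[0][0][0])
    output = []
    for i in range(depth - kd + 1):
        plane = []
        for j in range(height - kh + 1):
            row = []
            for k in range(width - kw + 1):
                total = bias
                # Sum over input channels and all kernel offsets.
                for m in range(channels):
                    for u in range(kd):
                        for v in range(kh):
                            for r in range(kw):
                                total += kernel[m][u][v][r] * volume[m][i + u][j + v][k + r]
                row.append(max(0.0, total))  # ReLU activation
            plane.append(row)
        output.append(plane)
    return output


def constant_volume(value, depth, height, width):
    return [[[value] * width for _ in range(height)] for _ in range(depth)]


# Hypothetical 1-channel 3 x 3 x 3 grid and a 2 x 2 x 2 averaging kernel.
grid = [constant_volume(1.0, 3, 3, 3)]
kernel = [constant_volume(0.125, 2, 2, 2)]
feature_map = conv3d_single_filter(grid, kernel, bias=0.0)
```

A full layer repeats this computation once per filter, so the number of output channels equals the number of filters.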

3.5 Feature Fusion and Attention-Driven Delay Prediction

The FlightNet-ST architecture's final stage merges the heterogeneous outputs of the three specialized modules—LSTM (temporal dependencies), GCN (spatial inter-airport relationships), and 3D-CNN (dynamic flight trajectory patterns)—into a unified representation that drives delay prediction. This module accepts high-dimensional feature embeddings:

The LSTM module chronologically encodes flight schedules, past delays, and weather changes.
The GCN module learns how airport network disruptions spread spatially.
The 3D-CNN module highlights route congestion and airborne abnormalities using spatio-temporal flight path dynamics signals.

These features are concatenated into a single vector and processed through a fully connected neural network (FCNN) equipped with an attention mechanism that enhances model transparency and interpretability. The attention layer assigns a weight to each contributing feature or module, helping the model focus on the most critical factors (e.g., airport congestion, route turbulence, departure time). The fusion module outputs either a continuous delay duration (regression) or a delay classification label (delayed or on-time). The attention mechanism improves both prediction performance and interpretability, enabling stakeholders to identify which features or modules contributed to the predicted delay. The integrated decision framework provides robust, explainable, and accurate flight delay forecasts for real-time operations.
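One simple way to realize such a weighting is module-level softmax attention, sketched below. This is an assumed form chosen for illustration, not the exact attention layer of FlightNet-ST; the scoring vectors, embedding sizes, and names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attention_fuse(embeddings, score_weights):
    """Weight each module embedding by a softmax score, then concatenate.

    embeddings and score_weights map a module name to a vector of equal length.
    """
    names = list(embeddings)
    scores = [sum(w * x for w, x in zip(score_weights[name], embeddings[name]))
              for name in names]
    alphas = softmax(scores)
    fused = []
    for name, alpha in zip(names, alphas):
        fused.extend(alpha * x for x in embeddings[name])
    return fused, dict(zip(names, alphas))


# Hypothetical module outputs and learned scoring vectors.
module_embeddings = {"lstm": [0.2, 0.7], "gcn": [0.9], "cnn3d": [0.1, 0.4, 0.3]}
scoring = {"lstm": [1.0, 0.5], "gcn": [2.0], "cnn3d": [0.3, 0.3, 0.3]}
fused_vector, module_weights = attention_fuse(module_embeddings, scoring)
```

The returned `module_weights` sum to one and can be inspected directly, which is the sense in which attention supports interpretability: a larger weight means the module contributed more to the fused representation.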

3.6 FlightNet-ST Performance Enhancement and Evaluation

Depending on the prediction target, the FlightNet-ST model uses loss functions suited to regression or classification. To estimate delay duration (regression), the model minimizes the Mean Absolute Error (MAE), which is resilient to outliers and directly interpretable in minutes, as given in Eq. 14:

$$\text{MAE}=\frac{1}{\text{N}}\sum_{\text{i}=1}^{\text{N}}|{\text{y}}_{\text{i}}-{\widehat{\text{y}}}_{\text{i}}|$$

(14)

where ${\text{y}}_{\text{i}}$ is the actual delay, ${\widehat{\text{y}}}_{\text{i}}$ is the predicted delay, and N is the total number of samples. In addition, the Root Mean Square Error (RMSE) measures the model's sensitivity to larger errors, as given in Eq. 15:

$$\text{RMSE}=\sqrt{\frac{1}{\text{N}}\sum_{\text{i}=1}^{\text{N}}{({\text{y}}_{\text{i}}-{\widehat{\text{y}}}_{\text{i}})}^{2}}.$$

(15)

For delay categorization (e.g., delayed vs. on time), the model uses Categorical Cross-Entropy Loss in Eq. 16:

$${\mathcal{L}}_{CE} = - \mathop \sum \limits_{i = 1}^{N} \mathop \sum \limits_{c = 1}^{C} y_{i}^{\left( c \right)} \log \hat{y}_{i}^{\left( c \right)} ,$$

(16)

where ${\text{y}}_{\text{i}}^{(\text{c})}$ is the ground truth binary indicator for class $\text{c}$, ${\widehat{\text{y}}}_{\text{i}}^{(\text{c})}$ is the predicted probability of class $\text{c}$, and C represents the total number of classes. The Adam optimizer updates the network weights for efficient convergence via adaptive moment estimation, as given in Eq. 17:

$$\vartheta_{t + 1} = \vartheta_{t} - \alpha \cdot \frac{{\hat{g}_{t} }}{{\sqrt {\hat{s}_{t} + \delta } }},$$

(17)

where $\vartheta$ represents the model parameters, $\alpha$ is the learning rate, $\hat{g}_{t}$ and $\hat{s}_{t}$ are the bias-corrected first- and second-moment estimates of the gradient, and $\delta$ is a small constant for numerical stability. Ablation studies demonstrate that FlightNet-ST outperforms baseline models, including LSTM, GCN, 3D-CNN, and classic machine learning techniques, in terms of MAE reduction.
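The update in Eq. 17 can be sketched as a single Adam step over a flat parameter list. The decay rates `beta1` and `beta2` are the commonly used Adam defaults and are an assumption here, as are the function and state names. The small constant is placed inside the square root to follow Eq. 17; many implementations add it outside the root instead.

```python
import math


def init_adam_state(size):
    """Zero-initialized moment estimates and step counter."""
    return {"t": 0, "g": [0.0] * size, "s": [0.0] * size}


def adam_step(theta, grad, state, lr=0.001, beta1=0.9, beta2=0.999, delta=1e-8):
    """Apply one Adam update and return the new parameter list."""
    state["t"] += 1
    t = state["t"]
    updated = []
    for k, (param, g) in enumerate(zip(theta, grad)):
        # Exponential moving averages of the gradient and its square.
        state["g"][k] = beta1 * state["g"][k] + (1 - beta1) * g
        state["s"][k] = beta2 * state["s"][k] + (1 - beta2) * g * g
        # Bias correction compensates for the zero initialization.
        g_hat = state["g"][k] / (1 - beta1 ** t)
        s_hat = state["s"][k] / (1 - beta2 ** t)
        updated.append(param - lr * g_hat / math.sqrt(s_hat + delta))
    return updated


# Hypothetical use: a few steps on f(theta) = theta^2, whose gradient is 2 * theta.
theta = [1.0]
state = init_adam_state(1)
for _ in range(3):
    theta = adam_step(theta, [2.0 * theta[0]], state, lr=0.1)
```

Because the step is normalized by the second-moment estimate, its magnitude is close to the learning rate in the first iterations regardless of the gradient scale.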

Class imbalance and skewed delay distributions are common in flight delay data, where on-time flights typically outnumber delayed flights. The model mitigates this issue through training measures such as stratified sampling, which keeps the class distribution consistent across the training, validation, and test sets. In addition, weighted loss functions are used in the classification task to penalize misclassification of the minority (delay) class more severely, thus increasing sensitivity to rare but essential delay cases. These measures help the model learn effectively from imbalanced data and maintain strong performance across both frequent and infrequent classes. The research design was strengthened accordingly.
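A class-weighted variant of the cross-entropy loss can be sketched as follows. Inverse-frequency weighting is an assumption made for illustration, since the weighting scheme is not specified; the function names and sample labels are hypothetical.

```python
import math


def inverse_frequency_weights(labels, num_classes):
    """Weight each class by N / (num_classes * count), so rare classes weigh more."""
    counts = [0] * num_classes
    for label in labels:
        counts[label] += 1
    total = len(labels)
    return [total / (num_classes * count) if count > 0 else 0.0 for count in counts]


def weighted_cross_entropy(labels, probabilities, class_weights, eps=1e-12):
    """Sum of -w_c * log(p_c) over samples, where c is the true class of each sample."""
    loss = 0.0
    for label, probs in zip(labels, probabilities):
        loss += -class_weights[label] * math.log(max(probs[label], eps))
    return loss


# Hypothetical batch: class 0 = on time, class 1 = delayed (minority).
labels = [0, 0, 0, 1]
predicted = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.4, 0.6]]
weights = inverse_frequency_weights(labels, num_classes=2)
loss = weighted_cross_entropy(labels, predicted, weights)
```

With these labels the delayed class receives three times the weight of the on-time class, so a missed delay costs more than a false alarm of equal confidence.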

4 Experimental Results and Insights

4.1 Data Source Information

The dataset contains historical on-time performance records for domestic US flights at major airports. Each row contains the date, airline, origin and destination airports, scheduled departure time (CRSDepTime), distance, arrival delay, and the binary is_delay label indicating whether the flight was delayed, as shown in Table 3. The dataset supports training machine learning algorithms to predict weather-related and operational flight delays, and it includes temporal ($\text{Month},\text{ DayOfWeek}$), spatial (Origin, Dest), and operational attributes. Predicting flight delays with LSTM, GCN, and 3D-CNN in the FlightNet-ST framework could also improve the user experience of booking platforms.

Table 3 Categorization and Description of Flight Delay Dataset Attributes


Figure 4 presents the US Domestic Flights Delay Prediction data (2013–2018), showing flight delay patterns by time of day and by month. Severe delays are color coded to indicate operational impact. Delays peak between 9:00 AM and 6:00 PM, which coincides with peak traffic hours.

The late-year months (October, November, and December) have the highest delay ratings, indicating that weather and traffic congestion are contributing factors. Because airport operations are complex, delay numbers are variable and volatile. Vertically, delays cluster within specific daytime hours, indicating short-term propagation and frequent triggering. Horizontally, the graph shows high-delay instances recurring at the same times across days. Horizontal red streaks indicate a recurring structural issue or congestion at the airport. This visual understanding of the temporal and spatial properties of delay propagation aids flight scheduling, resource planning, predictive modeling, and performance enhancement.

Figure 5 shows how weather affects monthly flight delays. The Z-axis represents flight delay frequency, the X-axis shows months from Jan to Dec, and the Y-axis shows weather conditions (Clear, Rain, Snow, Fog, Thunderstorm). This figure supports the exploratory data analysis stage of the delay prediction process. The temporal weather correlation helps a model such as the proposed FlightNet-ST learn how seasonal weather patterns (e.g., summer thunderstorms or winter fog) cause delays. The colored bars enhance interpretability by showing the contribution of each weather type over the months.

For instance,

winter impacts are indicated by December–February high “Snow” bars;
summer disruption trends are confirmed by June–August “Thunderstorm” spikes.

These insights enhance the spatio-temporal learning component of the model by showing when and where delays are most likely, improving aircraft scheduling and management forecasts and decisions.

Figure 6 illustrates the taxiway segments at the airport and their congestion index, expressed in minutes per kilometer. A higher index indicates more delay on a segment. The red segments P4-P5, P5-P6, and P6-36L are the most congested and are likely delay hotspots. The green segments P1-P2 and P8-P9 have low congestion. These data help estimate ground delays in the FlightNet-ST model. They enable the GCN module to capture how connected taxiways contribute to the spread of delays, the LSTM to learn how delays change over time, and the 3D-CNN to recognize patterns across space and time. Together, these data can improve airport operations and flight delay prediction.

5 Model Development and Evaluation

Figure 7 shows how optimizers and learning rates affect the FlightNet-ST model for civil aviation flight delay prediction. The line plot shows how the model's loss decreases over training epochs with the Adam optimizer. Lower rates, such as 0.0001 and 0.0005, provide steady convergence, whereas larger rates, such as 0.01, reduce the initial loss faster but risk instability. Consistent and dependable delay prediction therefore requires careful selection of the learning rate. The heatmap shows that Adam with a learning rate of 0.001 consistently yields the lowest losses, outperforming Adadelta and SGD. These findings support the integration of LSTM, GCN, and 3D-CNN architectures in FlightNet-ST, demonstrating that tuning the model's parameters and hyperparameters enhances its ability to capture complex spatio-temporal patterns in flight operations. This tuning improves forecast accuracy, reduces training error, and supports real-time operational decision-making.

5.1 Flight Delay Prediction Based on the FlightNet-ST

MAE, Mean Absolute Percentage Error (MAPE), and RMSE are used to evaluate machine learning models that predict flight delays, such as FlightNet-ST, as reported in Table 4. MAE measures the average absolute difference between predicted and actual delay values, indicating prediction accuracy. MAPE expresses the prediction error as a percentage of the actual value, which makes it easier to compare errors across different scenarios. Because it squares the errors, RMSE penalizes large errors more heavily, making it sensitive to outliers and useful in situations where large errors are unacceptable. These metrics are applied to FlightNet-ST, which uses LSTM, GCN, and 3D-CNN to process spatio-temporal flight data, and to the baselines. By comparing the MAE, MAPE, and RMSE values of LSTM, GCN, and traditional machine learning models, researchers can determine which model best predicts flight delays while minimizing errors, and thereby optimize scheduling, resource allocation, and operations to enhance airline efficiency and passenger satisfaction.
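The three regression metrics can be computed as in the sketch below. MAPE is returned as a fraction rather than a percentage, and samples with an actual delay of zero are skipped because the ratio is undefined for them; both choices are assumptions for this illustration, and the delay values are hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Square Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean Absolute Percentage Error as a fraction, skipping zero actual values."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    return sum(ratios) / len(ratios)


# Hypothetical actual and predicted delays in minutes.
actual_delay = [10.0, 20.0, 30.0]
predicted_delay = [12.0, 18.0, 33.0]
scores = {
    "MAE": mae(actual_delay, predicted_delay),
    "RMSE": rmse(actual_delay, predicted_delay),
    "MAPE": mape(actual_delay, predicted_delay),
}
```

RMSE is never smaller than MAE on the same data, and the gap between them widens when a few predictions are far off.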

Table 4 MAE, MAPE, and RMSE of Different Methods in Flight Delay Prediction


Flight datasets of various temporal scales and data types were used to assess the robustness of the FlightNet-ST model in real-world applications. Historical flight records were divided into monthly, quarterly, semi-annual, and annual segments to reflect different data volumes and operational patterns. The comparative analysis shows that FlightNet-ST has strong predictive consistency across all configurations. The model consistently exhibited Mean Absolute Error (MAE) values below 0.6, demonstrating good adaptability and stability regardless of changes in dataset size or feature dimensionality. As the data scale increased, the performance improvements became more pronounced, demonstrating the model's ability to exploit large datasets for accuracy. FlightNet-ST outperformed a 3D-CNN with GCN and LSTM in MAE, RMSE, and MAPE. These results support the model's scalability and dependability in flight delay prediction, especially in complicated air traffic scenarios. Because its accuracy is consistent across temporal aggregations, FlightNet-ST is well suited to civil aviation operations with variable data volumes, seasonality, and routing patterns.

Figure 8 shows the monthly MAE and RMSE of the proposed FlightNet-ST model. December had the lowest forecast error (MAE 11.0, RMSE 14.0), while May and October had higher errors due to seasonal fluctuation. The consistent gap between MAE and RMSE indicates stable performance across months. These findings support FlightNet-ST's ability to capture spatio-temporal delay patterns, making it a reliable and interpretable prediction model for civil aviation flight delays.

Figure 9 compares FlightNet-ST's MAPE with that of a baseline GCN-LSTM model across all months. FlightNet-ST has lower MAPE values, especially in summer and winter, indicating better adaptability to seasonal delays. The lower error margin indicates that the hybrid model better captures temporal, spatial, and dynamic flight behavior. This is in line with the aim of the research, which is to enhance civil aviation delay prediction using a robust, interpretable, and data-driven machine learning system.

Figure 10 shows the quarterly predictive performance of FlightNet-ST and GCN-LSTM by comparing their MAE and RMSE across four quarters. FlightNet-ST predicts flight delays with lower MAE and RMSE than GCN-LSTM year-round. This result supports the hybrid spatio-temporal learning approach, which blends LSTM, GCN, and 3D-CNN components to improve delay prediction.

Figure 11 compares the semi-annual and annual MAPE of the proposed FlightNet-ST model and GCN-LSTM. FlightNet-ST outperforms GCN-LSTM with lower MAPE values: 0.37 (Jan–Jun), 0.35 (Jul–Dec), and 0.36 (Annual), which indicates that FlightNet-ST can predict flight delays year-round. The model enhances prediction by using LSTM for temporal dependencies, 3D-CNN for spatial patterns, and GCN for airport network structures. These results demonstrate the model's potential to enable real-time decision-making in civil aviation, thereby reducing delays and improving efficiency.

Figure 12 compares the MAE and RMSE prediction errors of the proposed FlightNet-ST and the baseline GCN-LSTM models runway by runway. FlightNet-ST consistently has lower prediction errors on all runways (R12, R13, R23, R123), demonstrating its robustness and accuracy in varied runway settings. The reduction in MAE and RMSE is most notable on complicated runways such as R12, where RMSE drops significantly compared to GCN-LSTM. This shows that merging LSTM, 3D-CNN, and GCN captures the spatio-temporal and topological patterns crucial to flight delay prediction, enabling runway-specific operational decisions.

5.2 Binary Classification Metrics for Delay Prediction

Besides regression-based metrics, this work also uses binary classification metrics to assess how well the model predicts whether a flight will be delayed. Accuracy, precision, recall, and F1 Score provide a more interpretable view of the model's classification strength. Accuracy measures the overall proportion of correct predictions, and precision measures the proportion of predicted delays that are actual delays. Recall is the proportion of actual delays that are correctly predicted, and F1 Score is the harmonic mean of precision and recall. They are defined as follows:

$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN},$ $\text{Precision} = \frac{TP}{TP + FP},$ $\text{Recall} = \frac{TP}{TP + FN},$ and $\text{F1 Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$, where $TP$, $TN$, $FP$, and $FN$ are true positives (correctly predicted delays), true negatives (correctly predicted on-time flights), false positives (incorrectly predicted delays), and false negatives (missed delay predictions), respectively.
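These definitions translate directly into code. The sketch below computes all four metrics from confusion-matrix counts; the counts are hypothetical, and returning 0.0 for an undefined ratio is a convention chosen for this example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1 Score from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 80 delays caught, 20 missed, 10 false alarms, 90 correct on-time.
metrics = classification_metrics(tp=80, tn=90, fp=10, fn=20)
```

In this example, recall (0.80) is lower than precision (about 0.89), which means the classifier misses delays more often than it raises false alarms.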

This study also evaluated the FlightNet-ST model against several baseline methods. As Table 5 shows, FlightNet-ST performed best on all measures, with accuracy of 94.6%, precision of 93.8%, recall of 92.4%, and F1 Score of 93.1%, compared with LSTM, GCN, 3D-CNN, Random Forest, and XGBoost. The findings indicate that integrating spatio-temporal, topological, and trajectory data, along with the attention mechanism, significantly enhanced the accuracy of the model in binary delay forecasting.

Table 5 Binary Classification Performance of FlightNet-ST and Baseline Methods


To evaluate the models, a comparative study was conducted using both traditional models and novel state-of-the-art deep learning approaches for flight delay prediction. The performance of the developed FlightNet-ST model was compared to widely used traditional algorithms (Random Forest, XGBoost) and contemporary hybrid architectures (LSTM-GCN, Attn-LSTM, and ST-GAT). Main regression and classification metrics were used for comparison.

As shown in Table 6, the proposed FlightNet-ST model outperforms both conventional and current state-of-the-art deep learning baselines on all test metrics, for predicting delay durations as well as classifying delay occurrences.

Table 6 Comparative Performance of FlightNet-ST and State-of-the-Art Methods


6 Conclusion and Future Enhancement

This work presents FlightNet-ST, a robust model for civil aviation flight delay prediction. The approach captures the temporal and spatial dependencies of complex flight data by merging Long Short-Term Memory, Graph Convolutional, and 3D Convolutional Neural Networks. FlightNet-ST outperforms traditional machine learning and hybrid deep learning baselines in predictive performance, using a rich dataset of operational, temporal, and geographical features, including scheduled departure times, flight distance, airport locations, and carrier information. Experimental results reveal significant improvements in MAE, RMSE, and MAPE, along with consistent accuracy across datasets of varying scales. The approach's performance under diverse data volumes supports its use in aviation. The architecture's attention mechanism helps stakeholders identify important delay factors, such as departure time blocks and high-traffic airports.

This research can be extended in several directions. First, adding real-time meteorological data, air traffic congestion, and airline staff scheduling constraints could improve model accuracy and operational relevance. Second, supporting multi-label classification, such as delay cause identification, would give airport and airline operators more actionable knowledge. Third, adapting the model to international flights may require handling multiple data formats and regulatory contexts, which is a challenging task. Finally, deploying the model in a real-time prediction system coupled with airline scheduling tools could enable dynamic re-routing and proactive passenger communication, thereby boosting aviation efficiency and passenger satisfaction.

Data Availability

All data generated or analyzed during this study are included in this article.

Funding

This study was supported by the Natural Science Foundation of Sichuan Province [2022NSFSC1902], the Social Science Planning Project of Sichuan Province [SC22C001], the Fundamental Research Funds for the Central Universities [25CAFUC10032, 24CAFUC03048], and the Sichuan Provincial Engineering Research Center of Smart Operation and Maintenance of Civil Aviation Airports [JCZX2024ZZ22].

Author information

Authors and Affiliations

School of Air Traffic Management, Civil Aviation Flight University of China, Guanghan, 618307, Sichuan, China
Qingwei Zhong, Yingxue Yu, Yiru Huang & Tianhang Zhang

Contributions

Qingwei Zhong and Tianhang Zhang contributed to methodology and writing (original draft preparation); Yingxue Yu and Yiru Huang were involved in investigation and writing (review and editing).

Corresponding author

Correspondence to Qingwei Zhong.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethical Approval and Consent to Participate

Not applicable.

Consent to Publish

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zhong, Q., Yu, Y., Huang, Y. et al. Prediction and Optimization of Civil Aviation Flight Delays Based on Machine Learning Algorithms. Int J Comput Intell Syst 18, 189 (2025). https://doi.org/10.1007/s44196-025-00932-2

Received: 23 April 2025
Revised: 08 July 2025
Accepted: 15 July 2025
Published: 24 July 2025
Version of record: 24 July 2025
DOI: https://doi.org/10.1007/s44196-025-00932-2
