
Last year marked a pivotal moment in the technological revolution, as evidenced by the Nobel Prizes in Physics and Chemistry being awarded to scientists working at the intersection of artificial intelligence (AI) and basic science. AI for Science is not just redefining traditional paradigms of experiments and theoretical research but also injecting fresh, innovative momentum into virtually every industry by leveraging data, computation, and algorithms. Through robust software engineering practices, AI-driven approaches are reshaping our lives and work at unprecedented speed and efficiency, yielding novel applications in countless interdisciplinary fields.
In this dynamic landscape, Haochen Sui, a leading software development expert at VMware, has long focused on the fusion of software engineering and artificial intelligence. Over a career dedicated to solving real-world problems with cutting-edge technology, he has integrated AI solutions into multiple domains. His research spans Time Series Forecasting, Healthcare Diagnostics, and Recommendation Systems, with each project building on the last to address increasingly complex challenges. Sui's work shows how software engineering can elevate AI, making it not just a tool for innovation but a driver of cross-industry transformation.
Breakthrough Work to Deal with "Uncertainty": Faithful Time Series Forecasting Engine
One of Sui's most notable achievements came in 2024 with his groundbreaking work on faithful time series forecasting. Time series data, which tracks changes over time, is ubiquitous in industries like manufacturing, finance, and healthcare. It is crucial for predicting trends, managing risks, and allocating resources. However, in the context of AI for Science, traditional time series models often struggle with real-world complexities, such as noise and uncertain perturbations, which can destabilize predictions and limit their practical applications.
Sui led his team to tackle this challenge head-on by developing an innovative, faithful time series forecasting engine. This framework, rooted in information theory and statistical principles, ensures that predictions remain accurate and robust even in the face of data disturbances or external interference. The engine includes several innovations:
- Data Preprocessing and Wavelet Transform-Based Decomposition: The time series data undergoes thorough cleaning and segmentation, followed by wavelet transform-based decomposition to separate low-frequency trends from high-frequency details, enabling more precise and insightful analysis.
- Triple Constraints for Risk Resistance: The model is trained under three constraints (similarity in the information bottleneck space, consistency in the prediction space, and stability under noise perturbations) so that its predictions hold up when the input is disturbed.
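The decomposition step in the first item can be illustrated with a minimal sketch. It assumes a single-level Haar wavelet, which the article does not specify; the function names `haar_decompose` and `haar_reconstruct` are hypothetical and are not taken from Sui's engine.

```python
import math


def haar_decompose(series):
    """One level of the Haar discrete wavelet transform (illustrative only).

    Returns (approx, detail): approx carries the low-frequency trend,
    detail carries the high-frequency fluctuations.
    """
    if len(series) % 2 != 0:
        raise ValueError("this sketch expects an even-length series")
    scale = math.sqrt(2.0)
    # Pairwise scaled sums capture the trend; pairwise differences capture detail.
    approx = [(series[i] + series[i + 1]) / scale for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / scale for i in range(0, len(series), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose, showing that no information is lost."""
    scale = math.sqrt(2.0)
    series = []
    for a, d in zip(approx, detail):
        series.append((a + d) / scale)
        series.append((a - d) / scale)
    return series


# Hypothetical readings: a slow rise and fall with small local fluctuations.
readings = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
trend, fluctuations = haar_decompose(readings)
restored = haar_reconstruct(trend, fluctuations)
```

In a setup like this, the trend and the fluctuations could be modeled separately, and the detail coefficients of a flat segment (the last pair above) come out as zero.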
These components are integrated into the training pipeline. The result is a forecasting engine that not only outperforms traditional models in accuracy but also remains stable and reliable under uncertainty and adverse conditions. The applications are broad, from improving resource management in cloud computing to enhancing disease prediction in healthcare. Sui's work has set a new standard for time series forecasting and demonstrates how software engineering can turn AI into a robust tool for real-world applications.
Advancing Robust Healthcare Diagnostics with Invariant Spatiotemporal Learning
Building on his success in time series forecasting, Sui turned his attention to an even more pressing issue: how to enable robust healthcare diagnostics under distribution shifts more severe than noise perturbations. In 2024, he introduced a novel invariant spatiotemporal representation learning framework designed to address the challenges posed by heterogeneous patient data. In healthcare, data distributions can shift widely from patient to patient, which makes it difficult for traditional models to maintain accuracy across different populations. For example, an AI model trained on EEG data from one group of patients might perform poorly when applied to another group with different physiological profiles.
Sui's solution was to develop a framework that extracts patient-invariant features from EEG data, ensuring that predictions remain stable and accurate regardless of the patient's unique characteristics. The framework involves these novel designs:
- EEG Data Preprocessing and Invariant Mask Function: After preprocessing, a mask function is optimized to separate EEG features into invariant and variant representations. The invariant representation carries the key signals of disease type, while the variant representation records noise and artifact information.
- Tri-fold Invariance Constraints: Three constraints ensure that invariant spatiotemporal features are learned across different patient groups: invariance across patient groups, self-supervised constraints on disease labels, and a gradient-variance penalty on loss directions.
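The two ideas above can be made concrete with a small sketch under stated assumptions. It assumes a per-feature soft mask with values in [0, 1], and it swaps in a simpler stand-in for the invariance constraints: the variance of the mean loss across patient groups, rather than the gradient-based penalty the framework describes. The names `split_by_mask` and `group_risk_variance` are hypothetical.

```python
from statistics import mean, pvariance


def split_by_mask(features, mask):
    """Split a feature vector into invariant and variant parts.

    mask[i] in [0, 1] is an assumed learned weight: the share of feature i
    treated as patient-invariant. The two parts always sum to the input.
    """
    if len(features) != len(mask):
        raise ValueError("features and mask must have the same length")
    if any(m < 0.0 or m > 1.0 for m in mask):
        raise ValueError("mask values must lie in [0, 1]")
    invariant = [m * x for x, m in zip(features, mask)]
    variant = [(1.0 - m) * x for x, m in zip(features, mask)]
    return invariant, variant


def group_risk_variance(losses_by_group):
    """Variance of the mean loss across patient groups.

    A stand-in invariance penalty: it is zero when every group incurs the
    same average loss and grows as the groups diverge.
    """
    group_means = [mean(losses) for losses in losses_by_group.values()]
    return pvariance(group_means)


# Hypothetical EEG features and mask.
invariant, variant = split_by_mask([2.0, -1.0, 0.5], [1.0, 0.0, 0.5])

# Hypothetical per-sample losses for two patient groups.
balanced = group_risk_variance({"group_a": [0.25, 0.75], "group_b": [0.5, 0.5]})
skewed = group_risk_variance({"group_a": [1.0], "group_b": [3.0]})
```

Adding a penalty of this kind to the training loss would push a model toward features that predict equally well in every patient group, which is the intuition behind learning patient-invariant representations.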
Based on this design, Sui's invariant spatiotemporal representation learning software framework maintains predictive stability in the face of large distribution gaps across patient populations. This framework has been successfully applied in clinical settings, particularly in seizure-type classification, where it has provided clinicians with reliable auxiliary diagnostic tools. By ensuring that AI models can adapt to diverse patient populations, Sui's work has opened new possibilities for personalized medicine and improved healthcare outcomes.
Continuing Research to Address "Uncertainty": Unlearning-Driven Debiasing in Recommendation Systems
By 2025, Sui had expanded his research to another critical source of uncertainty: selection bias in recommendation systems. These systems, which suggest products, services, or content to users, are plagued by selection bias. Users tend to interact only with items that align with their preferences, which skews the collected data and leads to suboptimal recommendations. Additionally, user preferences evolve over time, introducing further instability.
Sui's response was to propose an unlearning-driven debiasing software architecture grounded in causal inference, a novel approach that leverages machine unlearning concepts to correct for bias. The objective is to maintain trustworthy and accurate recommendation results even as user bias evolves. The architecture has three key components:
- User Bias Identification: A dedicated network identifies users with severe selection biases and quantifies the extent of bias that needs to be "unlearned."
- Adversarial Pseudo-Label Generation: The system generates error-maximizing pseudo-labels to counterbalance user selection bias, then fuses these pseudo-labels with the real labels.
- Debiased Prediction Modeling: The model optimally adjusts recommendations by leveraging the unlearning rate and adversarial pseudo-labels, ensuring unbiased and fair predictions.
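The pseudo-label and fusion steps can be sketched as follows. This is a minimal illustration, assuming binary feedback, a cross-entropy loss, and a linear blend controlled by a per-user unlearning rate in [0, 1]; the article does not give the fusion rule, and the function names are hypothetical.

```python
import math


def error_maximizing_label(predicted_prob):
    """Binary label that maximizes cross-entropy for the given prediction."""
    return 0.0 if predicted_prob >= 0.5 else 1.0


def fuse_labels(real_label, pseudo_label, unlearning_rate):
    """Blend the real and pseudo labels (assumed linear fusion).

    unlearning_rate = 0 keeps the real label untouched; larger values push
    the training target away from the biased observation.
    """
    if not 0.0 <= unlearning_rate <= 1.0:
        raise ValueError("unlearning_rate must lie in [0, 1]")
    return (1.0 - unlearning_rate) * real_label + unlearning_rate * pseudo_label


def binary_cross_entropy(predicted_prob, target, eps=1e-12):
    """Cross-entropy between a predicted probability and a (soft) target."""
    p = min(max(predicted_prob, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


# Hypothetical case: the model is 90% sure a heavily biased user clicks.
prob = 0.9
pseudo = error_maximizing_label(prob)
soft_target = fuse_labels(real_label=1.0, pseudo_label=pseudo, unlearning_rate=0.25)
loss_on_real = binary_cross_entropy(prob, 1.0)
loss_on_fused = binary_cross_entropy(prob, soft_target)
```

Training against the fused target costs the model more for its overconfident prediction than training against the raw label would, so users flagged as strongly biased pull the model less toward their observed behavior.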
This groundbreaking approach, the first work to address selection bias in recommender systems from a machine unlearning perspective, is designed to keep recommendation systems fair, stable, and accurate even as user behavior changes. It represents a significant leap forward in the field and offers a new perspective on how to handle evolving bias in AI systems.
Research Impact and Future Directions
Haochen Sui's research journey runs from handling external perturbations (faithful time series forecasting), to managing distribution shifts across heterogeneous patient populations (invariant spatiotemporal learning), and finally to tackling systemic selection bias in recommendation systems (unlearning-driven debiasing). It illustrates a clear progression through increasingly complex forms of uncertainty in real-world data, and it showcases how robust software engineering principles can extend AI for Science to broader, cross-industry applications. Sui's work not only advances fundamental research in machine learning but also demonstrates the transformative power of AI when engineered for stability, fairness, and adaptability.
His findings have been published at top-tier conferences such as IJCAI 2024, NeurIPS 2024, and AAAI 2025, where he was invited to present his work and share his insights with global peers. Through his relentless pursuit of innovation, Sui has cemented his reputation as a leading figure in the artificial intelligence and software engineering communities.
Looking ahead, Haochen Sui's contributions will undoubtedly continue to shape the innovation frontier. His work serves as a blueprint for how software engineering can drive AI innovation across multiple disciplines, from forecasting and healthcare to recommendation systems and beyond. In a world increasingly reliant on AI, Sui's research offers a glimpse into a future where technology is not only powerful but also equitable, reliable, and transformative.