New ESnet Report Looks at AI-Powered Future for Scientific Networking
By Bonnie Powell, media@es.net
Key takeaways:
- Artificial intelligence usage is transforming scientific research, increasing data traffic. A new report from ESnet outlines how AI and machine learning, paired with strong data stewardship, can help manage the growing scale and complexity of DOE’s high-performance scientific network.
- The report's 29 work-packages and 31 recommendations set clear priorities in four focus areas: data management, traditional analytics, AI methods, and user experience.
- Trust and transparency are non‑negotiable. ESnet’s AI strategy prioritizes explainable systems, human oversight, ethical safeguards, and continuous feedback to ensure reliable, mission‑aligned operations.
In February 2025, a group of network and software engineers, data scientists, and AI experts from both academia and industry gathered in Berkeley, California, for a two-and-a-half-day deep dive into the future of scientific networking. The occasion was the Energy Sciences Network’s (ESnet’s) first-ever Data and AI Workshop — a focused effort to explore how data-driven methods paired with artificial intelligence/machine learning (AI/ML) could transform the way one of the world’s leading research networks operates. And in July, ESnet released a 200-page report (https://doi.org/10.2172/2571672) summarizing the workshop’s findings and the organization’s next steps.
For ESnet, this wasn’t just a brainstorming exercise. The Department of Energy runs dozens of important data-intensive scientific facilities — particle accelerators, high-intensity light sources, genomics labs, neutron-scattering centers, and more. These instruments generate torrents of complex data that must be moved, processed, and shared reliably and at lightning speed between scientific collaborators who may be a continent apart. ESnet is the high-performance “data circulatory system” that makes that possible, linking tens of thousands of researchers at the 17 national laboratories, the DOE’s 4 supercomputing facilities and 24 other user facilities, and 30,000 organizations around the globe.
The scale of this mission is growing rapidly. As experiments become more sophisticated and data volumes explode, ESnet’s network has grown more complex. That’s why ESnet is looking to AI and advanced data analytics: not because they’re trendy, but as a necessary evolution to keep pace with science.
Networking Gets Neural
Modern networks generate vast streams of telemetry data, respond to constantly shifting user demands, and face an ever-changing landscape of security threats. Historically, network operators have relied on human expertise coupled with static rules to keep things running. But that model is straining under the weight of scale and complexity.
AI offers a way forward — not to replace human expertise, but to augment it. Machine learning algorithms have the potential to spot subtle patterns in performance data, predict failures before they happen, and even automate certain operational decisions. The promise is a network that can plan, adapt, and defend itself in ways that are simply beyond human capacity alone. For ESnet, the challenge is to understand exactly where AI can deliver the most value, what kinds of data are needed to feed it, and how to ensure that any AI-driven system is transparent, trustworthy, and aligned with the network’s mission.
The February workshop brought together around 50 ESnet staff from across the organization (Planning & Innovation, Network Services, Systems & Software, Security, Science Engagement, and the Business Office). They were joined by six invited experts from academia and industry, each of whom offered a distinct viewpoint on AI in networking through an individual presentation and a lively panel discussion:
- Arpit Gupta, UC Santa Barbara Associate Professor and Berkeley Lab faculty scientist, mapped out a technical roadmap for moving from rule-based automation to AI-powered operations.
- Claudionor Coelho, Jr., Chief AI Officer at Zscaler, explored how large language models and generative AI could be applied to network operations (AIOps).
- Sangeetha Abdu Jyothi, UC Irvine Assistant Professor, stressed the importance of explainability in deep learning systems — a critical factor for trust.
- Taghrid Samak, Engineering Manager at Meta, shared how machine learning is reshaping network planning and optimization.
- Vyas Sekar, Carnegie Mellon University Professor, Cofounder/Chief Technologist of Rockfish, and Chief Scientist of Conviva, offered a high-level architecture for enabling AIOps in next-generation networks.
- Walter Willinger, Chief Scientist at NIKSUN, drew on his perspective as a security expert to provide a healthy dose of skepticism, challenging assumptions about AI/ML’s value.
The workshop unfolded in five sessions, each building on the last. First came defining the problems ESnet needs to solve for its next-generation network, ESnet7. Then the group examined the data available today — and the data that tomorrow’s applications may demand. The third session was an “AI primer,” grounding participants in what AI can and can’t do. The fourth session focused on turning ideas into concrete “work-packages” that paired problems with the right data and methods. Finally, the group looked across all the packages to identify common themes and priorities.
Data: The Fuel for AI
As the new report documents, one theme emerged quickly: AI is only as good as the data it learns from. ESnet has a wealth of operational data — from network performance metrics to incident response tickets — but as with most of today’s institutions, this data is scattered across systems, stored in different formats, and sometimes inconsistently documented. That makes large-scale analysis harder than it should be.
Some of ESnet’s most valuable data comes from end-to-end scientific workflows that cross multiple administrative domains. Collecting this data today is often a manual, time-consuming process. Improving accessibility, curation, and metadata standards could unlock new insights and make AI-driven automation far more effective.
The group also saw untapped potential in human-generated content — documentation, trouble tickets, and other narrative records. With natural language processing (NLP), this information could be mined to improve troubleshooting, training, and knowledge sharing by staff.
Traditional Analytics vs. AI — A Hybrid Future
Today, ESnet relies heavily on traditional analytics: statistical models, performance monitoring dashboards, and rule-based alerts. These tools work well for many tasks, such as detecting hardware failures or spotting obvious performance issues. But they struggle with more complex, multivariate problems, especially when data comes from multiple sources or includes unstructured text, and they are poorly suited to predicting failures before they occur.
The group agreed that AI would not replace traditional methods anytime soon, given the mission-critical needs of the network, but that it could likely complement them. In some cases, the best approach might be a hybrid: combining statistical techniques with NLP or machine learning. The key is to focus on use cases with high operational value and high-quality data, ensuring that efforts deliver a strong return on investment.
Several promising areas for AI emerged from the discussions:
- Anomaly detection — spotting unusual patterns in trends, rule-based logic, or semi-structured data before they cause problems.
- Root-cause analysis — helping engineers quickly understand why an issue occurred, based on historical data and context.
- Predictive modeling — forecasting network performance, anticipating failures, and guiding capacity planning.
- Automation — using AI agents and workflow engines to take action on certain issues with reduced or minimal human intervention.
- Data accessibility — making it easier to query and navigate ESnet’s vast operational records using natural language.
These capabilities could make ESnet’s operations more proactive, resilient, and efficient — freeing up the human experts to focus on higher-level problem solving.
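To make the anomaly-detection idea above more concrete, here is a minimal Python sketch. It is an illustration only, not a method described in the report: the function name, the window size, the threshold, and the sample link-utilization values are all hypothetical. It flags any sample that sits far from the mean of the samples just before it, measured in standard deviations (a rolling z-score).

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate strongly from the preceding window.

    Hypothetical helper for illustration; `window` and `threshold` are
    assumed values, not figures from the ESnet report.
    """
    anomalies = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history gives no scale for a z-score; skip it.
            continue
        if abs(samples[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up link utilization percentages with one obvious spike at index 6.
utilization = [40.0, 42.0, 41.0, 39.0, 40.0, 41.0, 95.0, 40.0, 42.0]
print(rolling_zscore_anomalies(utilization))
```

One limitation of this simple approach is that a spike inflates the standard deviation of the windows that follow it, which can hide later anomalies. That sensitivity is part of why more robust statistics or learned models may be worth exploring for complex telemetry.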
Trust, Transparency, and the Human in the Loop
The workshop made it clear that technical capability is only half the battle. For AI to succeed in ESnet’s high-performance environment, it must be transparent, accountable, and trustworthy.
That means giving users clear explanations of how AI systems reach their conclusions, because no one trusts a black box. It also means aligning AI workflows with ESnet-specific data and use cases, maintaining rigorous data protocols, and keeping humans in the loop for important decisions, with well-defined retraining and monitoring processes to ensure models stay accurate over time.
Ethics and bias were also key topics. The group stressed the importance of labeling AI-generated outputs, giving users control over how AI is applied, and protecting sensitive information. Continuous user feedback and transparent documentation will be essential to building — and keeping — trust.
Moving Forward
By the end of the workshop and the collaborative report-writing process, ESnet had distilled its discussions into 31 recommendations, ranging from tactical fixes to broad strategic initiatives, and settled on 29 work-packages. Together, they form a blueprint for how ESnet might integrate AI into its operations in a way that is technically robust, operationally valuable, and aligned with the network’s role as the DOE’s scientific data backbone.
“The report is both a roadmap and a call to action,” said Chin Guok, ESnet Chief Technology Officer, head of the Planning & Innovation department, organizer of the February workshop, and the report’s main author. “AI and advanced analytics will be essential to managing the ESnet network as we scale to meet the rising demand for AI and high-performance computing. But for these efforts to succeed, we’ll need to ensure good data stewardship, transparency, and trust in how AI systems are designed and used.”
For ESnet, the February workshop and the resulting report are just a starting point — a chance to map the terrain, identify the obstacles, and outline a path forward. This fall, ESnet plans to kick off four projects, two each in Technology Foundations and Operational Tasks, covering 16 of the 29 work-packages. The AI landscape is evolving quickly, and ESnet’s strategy will evolve with it. But the direction is clear: the future of scientific networking will be data-driven, AI-enhanced, and designed with human guardrails.