Your top network performance problems and how to fix them
Group Product Manager, Network Intelligence Center
Got network trouble, and don’t know where to start? Whether you’re trying to troubleshoot a performance problem or understand your network performance in order to make optimal deployment decisions, Google Cloud has a comprehensive set of tools for network monitoring, verification and optimization. With these tools, you can visualize, measure, troubleshoot and optimize network performance on Google Cloud, as well as in your on-premises and hybrid environments.
These tools will help you answer any number of questions about your network performance. But working with customers, we’ve noticed some network performance questions that come up again and again. In this blog post, we’re going to take you on a whirlwind tour of these tools, and show you how to use them to answer your most common network performance questions.
Network performance management on Google Cloud and beyond
Before we delve into the different performance troubleshooting scenarios that networking teams encounter, let's take a quick look at our offerings for troubleshooting the Google Cloud network, and beyond.
Network Intelligence Center is Google Cloud’s comprehensive network monitoring, verification and optimization platform for use across on-prem and cloud environments. Our vision with Network Intelligence Center is to enable what we call ‘intelligent network operations,’ delivered today through four modules, with several more to follow. The modules covered in this post are Performance Dashboard, Connectivity Tests and Network Topology.
PerfKit Benchmarker is an open-source tool created at Google that allows you to measure and understand performance across multiple clouds and hybrid deployments. PerfKit Benchmarker is a great tool for benchmarking your network performance to guide deployment decisions. Using PerfKit Benchmarker, we have also introduced a live dashboard measuring median inter-region performance metrics.
Now, let’s take a look at some common network performance scenarios that you’ve probably been asked to troubleshoot in your role as a network engineer.
1. The application is down or performing poorly
In this scenario, the networking team has to triage whether or not the underlying network is the root cause. Network Intelligence Center’s Performance Dashboard can show you real-time performance metrics (latency and packet loss) between the zones where you have VMs, enabling you to quickly troubleshoot where the packet loss is happening—and indeed, if it’s a networking issue at all.
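To make the triage step concrete, here is a minimal Python sketch of the kind of check you might run on latency and packet loss figures once you have them. It is not the Performance Dashboard API. The probe data, the zone pairs, the 1% loss threshold and the helper names summarize and suspect_pairs are all hypothetical and chosen only for illustration.

```python
from statistics import median

# Hypothetical probe results per zone pair: one round-trip time (ms) per probe,
# with None marking a lost probe. Made-up numbers, not real measurements.
probes = {
    ("us-central1-a", "us-east1-b"): [31.8, 32.1, 32.4, 31.9, 32.0],
    ("us-central1-a", "europe-west1-b"): [101.2, None, 100.7, None, 102.3],
}


def summarize(samples):
    """Return (packet_loss_percent, median_rtt_ms) for one zone pair."""
    if not samples:
        return 0.0, None
    received = [s for s in samples if s is not None]
    loss_pct = 100.0 * (len(samples) - len(received)) / len(samples)
    median_rtt = median(received) if received else None
    return loss_pct, median_rtt


def suspect_pairs(results, max_loss_pct=1.0):
    """List zone pairs whose packet loss exceeds the assumed threshold."""
    return [pair for pair, samples in results.items()
            if summarize(samples)[0] > max_loss_pct]


if __name__ == "__main__":
    for pair, samples in probes.items():
        loss, rtt = summarize(samples)
        print(pair, f"loss={loss:.1f}%", f"median_rtt={rtt} ms")
    print("Suspect pairs:", suspect_pairs(probes))
```

If no zone pair crosses the threshold, that is a reasonable hint that the problem lies in the application rather than the network.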
According to Chris McKean, Senior Network Engineer at AutoTrader UK, “Performance Dashboard in Network Intelligence Center has saved us several hours of fault finding and support calls. By highlighting packet loss in a particular Google Cloud zone, it has enabled us to quickly identify the root cause of real-time network issues.”
Performance Dashboard is now generally available (GA), and you can access it from the Network Intelligence Center console.
2. The network is being blamed for an outage
Network Intelligence Center’s Connectivity Tests module is your go-to for diagnosing connectivity issues. Connectivity Tests lets you self-diagnose connectivity issues within Google Cloud Platform (GCP), or from GCP to an external IP address that’s on-prem or in another cloud; that way, you can isolate whether or not the issue is in GCP. It also detects connectivity issues caused by misconfigurations, so you can see how a misconfiguration affects connectivity and proactively resolve problems that could lead to performance issues.
3. Users in a region are experiencing delays
The Network Topology module in Network Intelligence Center allows you to visualize your network and the associated network performance metrics, to better monitor network health. For instance, you can easily visualize how your users are being served worldwide, and whether they are being served out of the closest geographical region.
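As a rough illustration of the "closest geographical region" question, the following Python sketch compares the region serving a user with the nearest region by great-circle distance. It is an assumption-laden sketch, not how Network Topology works. The region coordinates are approximate, the user locations are invented, and the function names are hypothetical. Geographic distance is also only a proxy, since the lowest-latency region is not always the nearest one.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate, illustrative (latitude, longitude) pairs for three regions.
REGIONS = {
    "us-central1": (41.26, -95.86),   # Council Bluffs, Iowa
    "europe-west1": (50.45, 3.82),    # St. Ghislain, Belgium
    "asia-east1": (24.05, 120.52),    # Changhua County, Taiwan
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def closest_region(user_location, regions=REGIONS):
    """Name of the region geographically nearest to the user."""
    return min(regions, key=lambda name: haversine_km(user_location, regions[name]))


def served_from_closest(user_location, serving_region, regions=REGIONS):
    """True if the user is served from the geographically nearest region."""
    return closest_region(user_location, regions) == serving_region


if __name__ == "__main__":
    paris = (48.86, 2.35)  # hypothetical user location
    print("Closest region:", closest_region(paris))
    print("Served from closest?", served_from_closest(paris, "us-central1"))
```

A user for whom served_from_closest returns False is a good candidate for a closer look in the topology view.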
4. You need to benchmark performance to make deployment decisions
If you are thinking of migrating workloads to the cloud, you need visibility into expected network performance metrics so you can choose the best cloud and deployment architecture for your use case. PerfKit Benchmarker makes network performance benchmarking fast and easy by automating network setup, provisioning of VMs, and test runs. To help with your decision, you can use the live dashboard to view median inter-region latency and throughput for Google Cloud networks. Learn how to reproduce these results using PerfKit Benchmarker here. You can also use this methodology whitepaper to learn how to run a wide range of multi-cloud and hybrid network performance tests.
With the right tools, you can get a comprehensive understanding of your network performance, make well-informed decisions about where to put your workloads, prevent outages, and triage and troubleshoot performance issues quickly, all so that your users get the best possible experience. If you have other network performance questions you’d like us to tackle, reach out to us here.