This post first appeared on Stackify’s blog.
When you moved to the cloud, the performance gains were going to make everything better. Costs would go down since you’d only be paying for the resources you were actually using.
It sounded magical. You’d look like a star and save your company money.
Unfortunately, when all was said and done, the cloud didn’t deliver.
So where did things go wrong? Is it your app? Is it the cloud itself? Or was there something else that caused this pain?
Maybe it’s not the cloud itself that’s slowing your systems down. Maybe it’s just the way your apps and networks are configured. If the cloud isn’t delivering as you thought it would, it may be time to analyze your systems to see whether bulky dependencies are slowing you down.
You can’t look at each application or component in isolation. If the goal of moving to the cloud is to increase performance and reduce costs, look at your system and map how it works with your dependencies—and at the boundaries of those dependencies. Analyze interactions between systems and see how everything runs as a whole.
You could try to do this manually or rely on documentation. But you’d be making a mistake. Dependencies will be missed and key interactions with other systems will be overlooked—if for no other reason than your computing environment is unique to your organization.
For the best results, use an application performance management (APM) tool like Retrace to not only map and identify your dependencies, but also to monitor them on a consistent basis. APM tools make it easy for you to see what is being used, how often, and why. Identifying where inefficiencies lurk has never been simpler.
Historically, some of the biggest optimizations have been found in the database layer. The same holds true for cloud-based applications. You may be able to speed up your apps just by tweaking your database a bit.
In order to know what to tune, you need metrics on your database use. Some metrics to consider include:
- Frequency of database calls. How often are you calling the database?
- Latency of queries. How long are those queries taking?
- Bandwidth. How much data are you putting in or getting out?
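As a rough illustration of what those metrics look like (an APM tool like Retrace collects them for you automatically), here’s a hypothetical Python decorator that tracks call frequency, latency, and rows returned as a crude bandwidth proxy. The query and function names are made up for the example:

```python
import time
from collections import defaultdict

# Hypothetical in-process collector; an APM tool gathers these
# metrics for you, but this shows what is being measured.
query_stats = defaultdict(lambda: {"calls": 0, "total_seconds": 0.0, "rows": 0})

def track_query(name):
    """Decorator recording frequency, latency, and rows returned per query."""
    def wrap(fn):
        def inner(*args, **kwargs):
            start = time.perf_counter()
            rows = fn(*args, **kwargs)
            stats = query_stats[name]
            stats["calls"] += 1                                    # frequency
            stats["total_seconds"] += time.perf_counter() - start  # latency
            stats["rows"] += len(rows)                             # bandwidth proxy
            return rows
        return inner
    return wrap

@track_query("recent_orders")
def fetch_recent_orders():
    # Stand-in for a real database call.
    return [{"id": 1}, {"id": 2}]

for _ in range(3):
    fetch_recent_orders()

print(query_stats["recent_orders"]["calls"])  # → 3
```

Even this toy version makes the point: once every query is counted and timed, the most frequent and slowest queries stand out on their own.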
Instead of calculating those metrics by hand, use an APM tool to identify the queries used the most. You can also see what functionality in your application triggers these queries. Once you’ve identified the bottlenecks, you’ll have an easier time moving forward with performance tuning.
You’ll also want to look at your utilization. Database and document storage resources are often underutilized. Take a look at how much you have allocated and see if you can scale back without adverse effects. For efficiency’s sake, allocate for the data needs you have today—not the needs you will have two years from now.
And, finally, consider whether you need a database at all. Could you forgo the database completely and instead use some lightweight caching? Obviously, this depends on your application and needs. But it’s a question that not enough architects ask.
Speaking of caching, it looks like we’ve found another dependency.
Often seen as an easy win for performance and resource utilization, caching is a great way to reduce calls not only to the database but also to other external systems and APIs.
However, caching is not a silver bullet.
If you are caching data in memory with tools like Memcached, remember that the cache itself consumes resources and can hurt performance. Caching is not supposed to be the junk drawer of your application. Do your due diligence when deciding what to cache and what not to cache.
If you cache a bunch of your database query results and use up memory, for example, you may be slowing your application down while also paying higher hosting costs. Sometimes the best approach is simply calling the database when needed and not caching everything in sight.
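As an illustration of caching selectively rather than hoarding everything, here’s a minimal in-process TTL cache in Python. It’s a sketch of the idea, not a replacement for Memcached, and the product lookup is a hypothetical example:

```python
import time

class TTLCache:
    """Minimal time-based cache; a sketch, not production code."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # evict stale entries instead of hoarding them
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

cache = TTLCache(ttl_seconds=60)

def get_product(product_id, fetch_from_db):
    # Cache only this hot, rarely-changing lookup, not every query result.
    cached = cache.get(product_id)
    if cached is not None:
        return cached
    value = fetch_from_db(product_id)
    cache.set(product_id, value)
    return value

db_hits = []
def fetch_from_db(product_id):
    db_hits.append(product_id)  # stand-in for a real database call
    return {"id": product_id, "name": "widget"}

first = get_product(42, fetch_from_db)
second = get_product(42, fetch_from_db)  # served from cache, no second DB hit
print(len(db_hits))  # → 1
```

The short TTL is the point: entries expire and get evicted instead of accumulating in memory indefinitely.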
Message queues are a great way to reduce processing costs.
If your application has to run processes on incoming data, every piece of data that arrives consumes another slice of your resources.
Using queues helps balance those costs by letting only a fixed number of processes run at a time. Everything else queues up and waits patiently until its turn comes.
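A minimal sketch of the pattern in Python, using a fixed-size thread pool as a stand-in for a real message broker (RabbitMQ, SQS, and the like) and its consumers. The `process` function and the record shape are invented for the example:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical processing step; in a real system this would be a
# consumer pulling messages off a queue.
def process(record):
    return record["value"] * 2

records = [{"value": n} for n in range(10)]

# Only four records are processed at once; the other six wait
# in the internal work queue until a worker frees up.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, records))

print(results)
```

Whatever the queueing technology, the shape is the same: the arrival rate can spike, but the number of concurrent workers, and therefore the resources consumed, stays fixed.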
When your application calls other web services and APIs, it could be increasing your processing costs—especially if those calls are synchronous and your thread is waiting for a response. To reduce expenses, look for ways to call external APIs asynchronously when possible.
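As a sketch of the difference, here three simulated API calls are issued concurrently with Python’s `asyncio`, so the total wall time is roughly that of the slowest call rather than the sum of all three. The service names and delays are made up; a real version would use your stack’s async HTTP client:

```python
import asyncio

async def call_api(name, delay):
    # Stand-in for an external HTTP call; the sleep simulates network latency.
    await asyncio.sleep(delay)
    return name

async def main():
    # The three calls overlap instead of blocking one another.
    return await asyncio.gather(
        call_api("billing", 0.1),
        call_api("inventory", 0.1),
        call_api("shipping", 0.1),
    )

results = asyncio.run(main())
print(results)
```

Sequentially, those calls would cost about 0.3 seconds of waiting; overlapped, about 0.1. That gap is thread time you stop paying for.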
Also, if you’re using microservices, see if you can restructure your applications to make better use of the architecture. When looking at your own web services, more is not always better. Depending on your metrics, it may be time to consolidate some of those services.
Let’s take a look at an example. Imagine one of our microservices retrieves a large chunk of data and then sends it to another microservice for processing. In addition to the standard hosting costs of each service, we could be impeding our performance: both microservices need to hold that large payload in memory for each transaction. We’ve also added a network call. And, as a bonus, we’ve added a layer of marshaling and unmarshaling on each end.
Perhaps it’s time to look at combining the service that retrieves the data from the database and the service that processes that data. We’ll reduce the costs and add a performance improvement to our process.
Another example of overly fragmented services occurs when our applications call a common utility service—perhaps one that emails or notifies customers. If our services don’t make this call often, or the call isn’t resource-intensive, including the notification logic as a jar, module, or DLL in the other services could reduce hosting costs, too.
It’s easy to get lost in the world of dependencies.
There are a lot of moving pieces. Don’t look at each application, component, and dependency on its own. If the goal is increasing performance and reducing costs, you need to look at the boundaries and the interactions between systems to see how everything runs as a whole.
And, thanks to your APM tool, you can see what resources are being used and which systems and dependencies are using them.
Simple trial-and-error tuning of suspected critical resources is an easy way to determine which changes improve performance between dependencies.
The process is straightforward: Tweak configurations in your existing hosting plan and see if performance improves.
For example, if many of your applications depend on a shared microservice (e.g., a messaging service), then we can do a few things. First, let’s increase the instances of our messaging service and see if performance improves. If it does, great. This may be a critical resource to your systems. But perhaps performance doesn’t improve. Then you can move on to testing and validating something else.
Next, let’s try increasing either the memory or CPU on the messaging dependency and see if there is an improvement in overall performance. Using our APM tool, we’ll be able to see if that made a difference.
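The comparison itself can be simple. Here’s a Python sketch comparing 95th-percentile response times before and after a hypothetical tweak; the latency samples are invented for the example, and in practice your APM tool would supply them:

```python
def p95(samples):
    """95th-percentile latency from a list of response times (seconds)."""
    ordered = sorted(samples)
    index = max(0, int(round(0.95 * len(ordered))) - 1)
    return ordered[index]

# Hypothetical response-time samples collected before and after
# doubling the messaging service's instance count.
before = [0.120, 0.135, 0.150, 0.410, 0.125, 0.140, 0.380, 0.130]
after = [0.110, 0.115, 0.120, 0.160, 0.112, 0.118, 0.150, 0.114]

improvement = p95(before) - p95(after)
print(f"p95 improved by {improvement:.3f}s")
```

Comparing a tail percentile rather than an average matters here: the occasional slow outlier is usually what the extra instance or CPU actually fixes.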
If you see a change after your tweaks, you’ve identified a critical resource. Locating all of your critical resources will help you figure out what you really need to scale and what can live with fewer resources.
You can’t always control your dependencies. Some of them are black boxes that don’t give you much access. Others have limited configurability.
Instead of looking at each dependency on its own, look at the whole system as a set of connected components. See what data is available to monitor and infer what types of changes can not only improve performance but also reduce your costs.
Ultimately, without proper monitoring capabilities, you will be adjusting things blindly. You won’t be able to really find the problem or opportunity for improvement because you’ll be forced to follow hunches and instinct.
On the flip side, a tool that gives you the data to know, with certainty, where improvements can be made makes it that much easier to achieve the results you’re looking for. With the right solution in place, you can unlock the true power of the cloud—increasing performance and reducing costs along the way.