Best practices to cut costs in Azure

Introduction

There is no one-size-fits-all when it comes to Azure cost optimization, but the focus of this post is to share some of the tips & tricks I use in my daily life as a Cloud Solutions Architect

Some general tasks can be done monthly or quarterly to make sure that your Azure environment is up to date, keeping in mind that optimization and keeping your business running are the most important things here

Be advised that this post does not cover everything that can be done in Azure, simply because at the time of writing I had not needed to do it all

Why this post?

Every design in Azure has cost implications. Before architecting something, we must consider the budget that we will need for the project itself, taking into consideration things like:

  • Identify different boundaries for scale-up
  • Redundancy
  • BCP (business continuity planning), taking into consideration the cost of the solution
  • Design and set up scalable architectures, focusing on metrics & performance
  • Start small and scale out as soon as the required performance demands it (I really love that one)
  • Choose PaaS and SaaS over IaaS, and pay only for what you use as a consumer
  • Always monitor, audit & optimize the related costs

Ok I get it, but what are we going to cover?

Over the next few minutes I will explain some guidelines on cost optimization, in particular for the following topics:

  • Use of ARI (Azure Resource Inventory)
  • Use of Dev subscriptions
  • Optimal use of Azure App Services​
  • Optimal use of Auto-Scale in App Services​
  • Azure Data Factory Failed Pipelines
  • PaaS SQL Optimization
  • Cosmos DB
  • VM Right Sizing
  • Azure Hybrid Benefit​
  • Blob Storage Lifecycle​
  • Networking
  • Clean Orphan Resources​
  • RIs (Reserved Instances)
  • Use of Log Analytics
  • Use of Azure Advisor
  • Cost Management Preview (ACO Insights)
  • Azure Governance Dashboard
  • Closing

Before starting…

Before starting, I would recommend creating an inventory of your Azure environment; with the ARI tool it is pretty simple: https://github.com/microsoft/ARI
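
As a minimal sketch, assuming the module-based distribution of ARI (older releases ship as a standalone script with similar parameters):

# Sketch: generate the ARI Excel inventory for your tenant (the tenant id is a placeholder)
Install-Module -Name AzureResourceInventory -Scope CurrentUser
Invoke-ARI -TenantID "00000000-0000-0000-0000-000000000000"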

As you can observe, it will give you a great overview of what types of resources you have, their usage, locations, etc. Some of the sheets can also be used to optimize your Azure costs

Another tip: you can start your cost optimization journey with a self-assessment; it will give you some guidance on where you stand: https://docs.microsoft.com/en-us/assessments/

Use of Dev Subscriptions

Using the top-to-bottom approach, the first thing to pay attention to is Azure Dev/Test subscriptions, which are applicable to both enterprise and pay-as-you-go offers. By placing your dev resources in those subscriptions, you will get lower prices for the most common Azure services, at the cost of excluding them from the regular vendor SLA commitments.

Optimal use of Azure App Services
First, check that Standard and Premium plans have an associated application

I have seen a lot of empty App Service plans, which lead to unnecessary costs for the customer; remember that having proper governance in your subscriptions is also a cost measure.
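
As a quick check, here is a sketch that lists plans hosting zero apps (assuming the Az.Websites module and an authenticated session):

# Sketch: find App Service plans with no apps attached, which are pure waste
Connect-AzAccount
Get-AzAppServicePlan |
    Where-Object { $_.NumberOfSites -eq 0 } |
    Select-Object Name, ResourceGroup, @{ n = 'Tier'; e = { $_.Sku.Tier } }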

Another thing I tend to do is check the metrics for the plan and verify whether it is being used properly (scale down if needed, but remember the features required in each case)

Optimal use of Auto-Scale in App Services​

In my case, I was able to scale down my resources, but first check the feature differences between Standard and Premium plans (or even between versions of the same tier!). What can also be done is to scale up/down based on a schedule: https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-app-service-automatic-scaling/ba-p/2983300

Very useful for those workloads that we know are only needed during certain periods of time
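
As a sketch of the schedule idea, a minimal Automation runbook that drops a plan to a smaller tier off-hours (all names are hypothetical; assumes the Az.Websites module and a managed identity on the Automation account):

param(
    [string]$ResourceGroupName = "rg-myapp",   # hypothetical resource group
    [string]$PlanName          = "asp-myapp",  # hypothetical App Service plan
    [string]$Tier              = "Standard",   # target tier for off-hours
    [int]   $Workers           = 1             # target instance count
)

# Authenticate with the Automation account's managed identity
Connect-AzAccount -Identity

# Scale the plan down; schedule a mirror runbook to scale it back up in the morning
Set-AzAppServicePlan -ResourceGroupName $ResourceGroupName `
                     -Name $PlanName `
                     -Tier $Tier `
                     -NumberofWorkers $Workers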

Azure Data Factory​

Review the failing pipelines: if a pipeline is constantly failing to run, it will probably impact the cost of your resource, so take action

Again, I have reviewed a lot of pipelines in Data Factory which are continuously failing… take care of that as well
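
A sketch to surface the worst offenders over the last week (factory and resource group names are hypothetical; assumes the Az.DataFactory module):

# Sketch: count failed pipeline runs per pipeline over the last 7 days
$runs = Get-AzDataFactoryV2PipelineRun -ResourceGroupName "rg-data" `
                                       -DataFactoryName "adf-demo" `
                                       -LastUpdatedAfter  (Get-Date).AddDays(-7) `
                                       -LastUpdatedBefore (Get-Date)

$runs | Where-Object { $_.Status -eq 'Failed' } |
        Group-Object PipelineName |
        Sort-Object Count -Descending |
        Select-Object Name, Count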

PaaS SQL Optimization

With Azure Monitor, check whether the database needs all the DTUs provisioned. One thing I love to do is play with the different available tiers for SQL: if you’re running a tier with a lot of DTUs, implement a runbook to reduce the tier when you don’t need it. For example, you can use GitHub – francesco-sodano/azure-sql-db-autoscaling: an ARM template that deploys an Azure SQL Database with a DTU consumption plan (with a new Azure SQL server), including all the resources required to perform auto-scaling (scale up and scale down) based on metric alerts using a function app. Again, very useful for those workloads that we know are only needed during certain periods of time
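
The core of such a runbook can be as small as this sketch (server and database names are hypothetical; assumes the Az.Sql module):

# Sketch: drop an Azure SQL database to a smaller DTU tier when it is not needed
Set-AzSqlDatabase -ResourceGroupName "rg-data" `
                  -ServerName "sql-demo" `
                  -DatabaseName "db-app" `
                  -Edition "Standard" `
                  -RequestedServiceObjectiveName "S1"   # e.g. scale from S3 down to S1 overnight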

Focus on those DBs using 40%-80% of their DTU capacity; those are the most important ones to review for rescaling

Check if you really need geo-replication; you probably don’t need to replicate your DB across regions (important point!). Remember the first bullets of this post: we need to start small and then plan big. If you enable geo-replication on DBs that are not being used, or on test databases, you’re wasting your money

Cosmos DB

With the help of metrics, review usage to choose the correct size & throughput

Consider autoscale for those types of resources (it avoids provisioning unnecessary throughput)

Consider serverless for Dev & Test environments, or for those environments where traffic is intermittent: Consumption-based serverless offer in Azure Cosmos DB | Microsoft Learn
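
A sketch of moving a container to autoscale throughput (account, database and container names are hypothetical; assumes the Az.CosmosDB module):

# Sketch: migrate a SQL API container from manual to autoscale throughput,
# then cap the maximum RU/s it can scale up to
Invoke-AzCosmosDBSqlContainerThroughputMigration -ResourceGroupName "rg-data" `
    -AccountName "cosmos-demo" -DatabaseName "appdb" -Name "orders" `
    -ThroughputType Autoscale

Update-AzCosmosDBSqlContainerThroughput -ResourceGroupName "rg-data" `
    -AccountName "cosmos-demo" -DatabaseName "appdb" -Name "orders" `
    -AutoscaleMaxThroughput 4000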

VM Right Sizing​
One thing I love to do is shut down VMs based on a schedule

The schedule is set up with a tag on the resource, and the operation is done by an Automation account (it could be a Logic App as well). For example, I love to use the following solution:

Scheduled Virtual Machine Shutdown/Startup – Microsoft Azure | Automys

You can set up the schedule tag on the VMs (the Automys solution reads an AutoShutdownSchedule tag, with values like “10PM -> 6AM, Saturday, Sunday”)

Your VMs will then automatically shut down and start on the configured schedule. For those test and pre-production environments where Azure Reservations do not fit, this is simply great: you will save a bunch of compute hours with this simple script
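
Tagging can be done from the portal or, as a sketch, from PowerShell (VM and schedule values are hypothetical; the tag name follows the Automys solution):

# Sketch: tag a VM so the scheduled shutdown/startup runbook picks it up
$vm = Get-AzVM -ResourceGroupName "rg-test" -Name "vm-dev01"
Update-AzTag -ResourceId $vm.Id `
             -Tag @{ AutoShutdownSchedule = "10PM -> 6AM, Saturday, Sunday" } `
             -Operation Merge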

To cut costs further, we can use Spot VMs for non-priority tasks (they can save a lot of money compared to regular pay-as-you-go VMs); you can get more info at: Use Azure Spot Virtual Machines – Azure Virtual Machines | Microsoft Learn

Get rid of those old VM sizes

One thing that I do for all of my clients in order to optimize cost is to check which size series the VMs are running; this can be extracted from the ARI (remember the first tool):

Why? Because, as you probably know, Microsoft is always optimizing the hardware in its datacenters and releasing new versions of VM sizes. So what’s the point? The older the VM series, the higher the cost, so check whether there is a newer equivalent size and you will be able to save some money on each VM.

Imagine that you have 100 VMs running on a v2 series, and moving from v2 to v5 represents a cost difference of 20€/VM/month; in total that is a saving of 2,000€/month just by changing the VMs to a newer version. Not bad, huh?
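
Resizing itself is a one-liner per VM; a sketch with hypothetical names and sizes (note the VM reboots during a resize, and the new size must be available in the region):

# Sketch: move a VM to a newer size series
$vm = Get-AzVM -ResourceGroupName "rg-prod" -Name "vm-app01"
$vm.HardwareProfile.VmSize = "Standard_D2s_v5"   # e.g. coming from Standard_DS2_v2
Update-AzVM -VM $vm -ResourceGroupName "rg-prod"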

Azure Hybrid benefit

The first question is: do you have Software Assurance with Microsoft? If the answer is yes, don’t waste more time and money: apply it to your Azure resources. It will help you save up to 40% in cost (for VMs and SQL)

If you want to know how much you can save with this, you can use the Azure Calculator for this purpose: https://azure.microsoft.com/en-us/pricing/hybrid-benefit/#calculator
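
Applying the benefit to an existing Windows VM is a small change; a sketch with hypothetical names:

# Sketch: enable Azure Hybrid Benefit on an existing Windows VM
$vm = Get-AzVM -ResourceGroupName "rg-prod" -Name "vm-app01"
$vm.LicenseType = "Windows_Server"   # assumption: you own eligible Windows Server licenses with SA
Update-AzVM -VM $vm -ResourceGroupName "rg-prod"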

Storage Lifecycle

With this procedure I was able to save a lot of money on a recent IoT project: all the information was stored in blobs, and once a certain period of time had passed, we moved the information from one tier to another in order to cut storage costs
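
A sketch of such a lifecycle policy (account and prefix names are hypothetical; assumes the Az.Storage module): cool after 30 days, archive after 90, delete after a year.

# Sketch: tier blobs down over time and finally delete them
$action = Add-AzStorageAccountManagementPolicyAction -BaseBlobAction TierToCool -DaysAfterModificationGreaterThan 30
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -BaseBlobAction TierToArchive -DaysAfterModificationGreaterThan 90
$action = Add-AzStorageAccountManagementPolicyAction -InputObject $action -BaseBlobAction Delete -DaysAfterModificationGreaterThan 365

$filter = New-AzStorageAccountManagementPolicyFilter -PrefixMatch "iot-telemetry"
$rule   = New-AzStorageAccountManagementPolicyRule -Name "tier-iot-data" -Action $action -Filter $filter

Set-AzStorageAccountManagementPolicy -ResourceGroupName "rg-iot" -StorageAccountName "stiotdata" -Rule $rule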

Networking

Check out the costs related to networking; they may scare you

You will need to identify which applications are using most of the egress bandwidth and review & redesign your infrastructure accordingly

Check which gateways are not being used, probably those with a throughput lower than 900MB/day

Check your Azure ExpressRoute circuits; the first provisioning of the circuit was probably larger than needed

So, check Azure Monitor: Monitor – Microsoft Azure
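
A sketch for the gateway check, pulling a week of bandwidth metrics (resource names are hypothetical, and the metric name varies per gateway type, so verify it in Azure Monitor’s metric list):

# Sketch: retrieve a VPN gateway's average bandwidth for the last 7 days
$gw = Get-AzVirtualNetworkGateway -ResourceGroupName "rg-net" -Name "vgw-hub"
Get-AzMetric -ResourceId $gw.Id -MetricName "AverageBandwidth" `
             -StartTime (Get-Date).AddDays(-7) -EndTime (Get-Date) `
             -TimeGrain 01:00:00 -AggregationType Average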

Clean Orphan Resources

Are you sure that everything you have in your subscription is being used? Use this workbook and take action in your subscription: Azure Orphan Resources (microsoft.com)

Save Azure costs by deleting those unused disks and public IPs, which still generate storage and IP address charges (remember that Azure Advisor shows these recommendations as well):
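
A read-only sketch to find the usual suspects (review the output before deleting anything):

# Sketch: list unattached managed disks and unassigned public IPs
Get-AzDisk | Where-Object { $_.DiskState -eq 'Unattached' } |
    Select-Object Name, ResourceGroupName, DiskSizeGB

Get-AzPublicIpAddress | Where-Object { -not $_.IpConfiguration } |
    Select-Object Name, ResourceGroupName, IpAddress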

I’m sure that you will save a bunch of €€€ with this procedure

Use of Log Analytics

If you’re using Log Analytics to monitor your Azure resources, you should add a daily cap to your Log Analytics workspace: https://learn.microsoft.com/en-us/azure/azure-monitor/logs/daily-cap#view-the-effect-of-the-daily-cap

Also a few tips:

  • Use the Azure Monitor Agent and Data Collection Rules over the Log Analytics agent
  • Set retention per table (see the sketch after this list) and leave the workspace retention at its default
  • Set the archive tier per table – to meet certain compliance rules, you may need some of the data available for a longer period of time
  • Configure diagnostic settings with only the logs that are needed and used
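
For the per-table retention and archive tips above, a sketch (workspace and table names are hypothetical; assumes a recent Az.OperationalInsights module with Update-AzOperationalInsightsTable):

# Sketch: 30 days of interactive retention, with the rest of the year in the archive tier
$params = @{
    ResourceGroupName    = "rg-mon"      # hypothetical
    WorkspaceName        = "law-demo"    # hypothetical
    TableName            = "AppTraces"
    RetentionInDays      = 30            # interactive (query) retention
    TotalRetentionInDays = 365           # interactive + archive retention
}
Update-AzOperationalInsightsTable @params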

Use of Azure Advisor
I must admit that I’m a fan of Azure Advisor; for any project that I have, I always tend to review Advisor in order to cut Azure costs

It helps to detect whether a virtual machine runs on a VM size greater than what it needs (based on CPU utilization under 5% over the last 14 days). If Azure Advisor reports an overprovisioned machine, you need to investigate its use and resize it to a more suitable size.
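
You can also pull these recommendations programmatically; a sketch assuming the Az.Advisor module:

# Sketch: list Advisor's cost recommendations for the current subscription
Get-AzAdvisorRecommendation -Category Cost |
    Select-Object ImpactedValue, @{ n = 'Problem'; e = { $_.ShortDescription.Problem } }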

For this VM rightsizing purpose, I also use a script from Jos Lieben, which helps to put your underused VMs in the right size in terms of load: Automatic modular rightsizing of Azure VM’s with special focus on Azure Virtual Desktop | Liebensraum

Reserved Instances

Reserved Instances allow us to reduce costs; there are a lot of resource types that can be reserved, so take them into account when you’re designing your infrastructure

There are a lot of Azure resources available to be reserved, make use of them 🙂

Azure Advisor always recommends reservations for our resources as well; don’t forget it

Azure Budgets

Budgets send a notification when a certain amount of money has been spent; they can be set at the resource group or subscription level and can, for example, email the application/subscription owner
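
A sketch of a monthly budget with an email alert at 80% of spend (amount and address are hypothetical; assumes the Az.Billing module):

# Sketch: subscription-level monthly budget that notifies the owner at 80% spend
New-AzConsumptionBudget -Name "budget-monthly" `
    -Amount 1000 `
    -Category Cost `
    -TimeGrain Monthly `
    -StartDate (Get-Date -Day 1).Date `
    -ContactEmail "owner@contoso.com" `
    -NotificationKey "Alert80Percent" `
    -NotificationEnabled `
    -NotificationThreshold 80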

Azure Cost Management

Remember to keep a close eye on the latest updates from Cost Management: https://azure.microsoft.com/en-us/blog/microsoft-cost-management-updates-november-2022/ I’m sure that you’ll take advantage of those new features 😉

Insights is the new ACM feature that gives us insight into our daily spending on Azure resources; we can detect what is a trend and what is a cost anomaly in our subscriptions

Azure Governance Dashboard

If you want to deploy a high-level Power BI visualization of your Azure resources, you can implement the CCO Dashboard from GitHub: https://github.com/Azure/ccodashboard

I know that this is more related to governance, but it helps to have a bird’s-eye view of the different resources and Azure subscriptions.

PRO TIP

If you really like these cost recommendations, there is a tool on GitHub, https://github.com/helderpinto/AzureOptimizationEngine, which can augment the Azure Advisor recommendations and help you optimize your environment

Closing

That’s all! You are probably already following some of these recommendations, but I hope this post was interesting to you 😊

Till next time, merry Christmas and happy holidays!


OATH Hardware Tokens for AzureAD

As you have probably read in my previous posts, I’ve been talking about FIDO2 keys and how they can be used as a secondary authentication method when signing in to Azure AD.

Today, I want to talk about OATH hardware tokens, also known as time-based one-time password (TOTP) tokens.

As you are aware, some authentication methods can be used as the primary factor when you sign in to an application or device, such as using a FIDO2 security key or a password. Other authentication methods are only available as a secondary factor when you use Azure AD Multi-Factor Authentication or SSPR.

Microsoft’s documentation includes a table that outlines when an authentication method can be used during a sign-in event.

OATH TOTP is an open standard that specifies how one-time password (OTP) codes are generated. OATH TOTP can be implemented using either software or hardware to generate the codes, and OATH TOTP hardware tokens typically come with a secret key, or seed, pre-programmed in the token.
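
To make the standard less abstract, here is a minimal sketch of how a TOTP code is derived from a base32 seed (per RFC 4226/6238), for illustration only; real tokens do this in hardware:

# Sketch: derive the current 6-digit TOTP code from a base32 seed
function Get-TotpCode {
    param(
        [string]$Base32Secret,     # the seed pre-programmed in the token
        [int]$PeriodSeconds = 30,  # time step defined by the standard
        [int]$Digits = 6
    )
    # Decode the base32 seed into key bytes
    $alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'
    $bits = -join ($Base32Secret.ToUpper().TrimEnd('=').ToCharArray() |
        ForEach-Object { [Convert]::ToString($alphabet.IndexOf($_), 2).PadLeft(5, '0') })
    $keyBytes = for ($i = 0; $i + 8 -le $bits.Length; $i += 8) {
        [Convert]::ToByte($bits.Substring($i, 8), 2)
    }
    # Counter = number of 30-second steps since the Unix epoch, big-endian
    $counter = [long][math]::Floor([DateTimeOffset]::UtcNow.ToUnixTimeSeconds() / $PeriodSeconds)
    $counterBytes = [BitConverter]::GetBytes($counter)
    [Array]::Reverse($counterBytes)
    # HMAC-SHA1 over the counter, then dynamic truncation to N digits
    $hmac = [System.Security.Cryptography.HMACSHA1]::new([byte[]]$keyBytes)
    $hash = $hmac.ComputeHash($counterBytes)
    $offset = $hash[-1] -band 0x0F
    $code = ((($hash[$offset] -band 0x7F) -shl 24) -bor
             ($hash[$offset + 1] -shl 16) -bor
             ($hash[$offset + 2] -shl 8) -bor
             $hash[$offset + 3])
    ([int]($code % [math]::Pow(10, $Digits))).ToString('0' * $Digits)
}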

In this post, I will show you how the OTP C200 token from Feitian can be configured in Azure AD and how it works.

First of all, you have to register the key in Azure AD. To do this, you will need the serial number of the key and the secret key provided by the manufacturer, and then you need to create a CSV file with all the information:
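
A sketch of building that CSV (the header row follows the documented upload format; the UPN, serial and secret below are placeholders):

# Sketch: generate the OATH token upload CSV with placeholder values
@"
upn,serial number,secret key,time interval,manufacturer,model
alba@contoso.com,1234567,2234567abcdef2234567abcdef,30,Feitian,c200
"@ | Set-Content -Path .\oath-tokens.csv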

Once you have done this, these keys must be uploaded into Azure AD: Multifactor authentication – Microsoft Azure

Upload the file and activate the key in the portal; once that has been done, it will show you a screen like the following:

If you have any error during the upload, it will be shown in the portal itself:

You must consider that you can activate a maximum of 200 OATH tokens every 5 minutes.

Also, as you have probably figured out, users may have a combination of OATH hardware tokens, the Authenticator app, FIDO2 keys, etc…

Be aware that users can configure their default sign-in method on the security info page: My Sign-Ins | Security Info | Microsoft.com

So, once the key has been configured for the user, what is the flow to access the account?

I have compared the authentication flow with the FIDO2 key flow; the difference you can appreciate is that with FIDO2 keys it is not necessary to enter my password

Finally, check out the table from Microsoft where you can see different persona cases and which passwordless technology can be used for each one of them.

IMHO, FIDO2 keys are great, but thinking as an end user they have one problem: the initial setup. We must rely on end users to configure the key and associate it with Azure AD themselves (remember the previous table). FIDO2 keys do have the advantage that they can be used to sign in to the computer instead of using a password.

On the other hand, OATH tokens are great because you, as an administrator, can configure the keys in the AAD portal and, once they have been activated, provide them to end users without any other action needed from the end user’s perspective. And, most importantly, they are very easy to use

Thanks to Feitian for providing such amazing tokens

Why you should block legacy authentication

Currently, we could say that legacy authentication is one of the most compromised sign-in methods. Luckily for us, older protocols have been replaced by modern authentication services, which have the advantage that modern authentication supports MFA, while legacy authentication refers to all protocols that use Basic Authentication and require only one method of authentication.

So, for security reasons, it is important that we disable legacy authentication in our environments. Why? Because enabling MFA isn’t effective if legacy protocols are not blocked. For example, the following options are considered legacy authentication protocols:

  • Authenticated SMTP – Used by POP and IMAP clients to send email messages.
  • Autodiscover – Used by Outlook and EAS clients to find and connect to mailboxes in Exchange Online.
  • Exchange ActiveSync (EAS) – Used to connect to mailboxes in Exchange Online.
  • Exchange Online PowerShell – Used to connect to Exchange Online with remote PowerShell. If you block Basic authentication for Exchange Online PowerShell, you need to use the Exchange Online PowerShell Module to connect.
  • Exchange Web Services (EWS) – A programming interface that’s used by Outlook, Outlook for Mac, and third-party apps.
  • IMAP4 – Used by IMAP email clients.
  • MAPI over HTTP (MAPI/HTTP) – Used by Outlook 2010 and later.
  • Offline Address Book (OAB) – A copy of address list collections that are downloaded and used by Outlook.
  • Outlook Anywhere (RPC over HTTP) – Used by Outlook 2016 and earlier.
  • Outlook Service – Used by the Mail and Calendar app for Windows 10.
  • POP3 – Used by POP email clients.
  • Reporting Web Services – Used to retrieve report data in Exchange Online.
  • Other clients – Other protocols identified as utilizing legacy authentication

How can we monitor the usage of legacy authentication in Azure AD?

Thanks to Log Analytics, Insights and workbooks, we are able to monitor the use of those protocols. For instance, you can review the interactive sign-ins:
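
A sketch of such a query run from PowerShell (the workspace id is a placeholder; assumes the Az.OperationalInsights module and that Azure AD sign-in logs are exported to the workspace — the ClientAppUsed values below are the common legacy ones):

# Sketch: summarize legacy-authentication sign-ins by protocol and user
$query = @"
SigninLogs
| where ClientAppUsed in ('Exchange ActiveSync', 'IMAP4', 'POP3', 'Authenticated SMTP', 'Other clients')
| summarize SignIns = count() by ClientAppUsed, UserPrincipalName
"@
(Invoke-AzOperationalInsightsQuery -WorkspaceId "00000000-0000-0000-0000-000000000000" -Query $query).Results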

And check the non-interactive sign-ins as well (be careful with the AD Connect sync accounts):

What can we do to avoid this?

The best way to block or report legacy authentication for users is to use Conditional Access policies (see Does my organization need Azure AD Conditional Access? – Albandrod’s Memory (albandrodsmemory.com) and Enabling zero trust security in your environment – Albandrod’s Memory (albandrodsmemory.com)).

So, go ahead and create a CA policy that blocks legacy authentication clients:

My final advice

Legacy authentication must be disabled to protect our environments, but first, start small and analyse the impact on your organization.

Till next time!

Best practices when updating Lync Server

Cumulative Updates (CUs) are a kind of service pack that comes out quarterly for Lync Server and its clients. They include fixes, and sometimes new functionality is added.

For Lync Server 2010, it is possible to download them from the following URL: http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=11551

For Lync Server 2013, use the following: http://www.microsoft.com/en-us/download/details.aspx?id=36820

As you can see, there are a lot of files to download; you can download and update specific components, or you can download the LyncServerUpdateInstaller.exe package that includes all the latest updates. So go ahead and download it, and then copy the file to your Lync servers.

To start the update process, log in to your server and start the Lync Server Management Shell.

First, check that no users are talking on the phone or are in a meeting before you start the update. You can do this by running Get-CsWindowsService

The next command prevents new sessions for a while and drains the active connections. This can be done by running Stop-CsWindowsService -Graceful
As seen in the picture, the services are now stopped.

3. Next, stop the World Wide Web service by typing: net stop w3svc
4. Now close all Lync Server Management Shell windows.
5. Install the cumulative update for Lync Server 2010 by running LyncServerUpdateInstaller.exe

This will start the update tool, and you should see what updates are needed and which version is already installed. (As you can see in the picture, I have already installed the latest update package, so it shows a green checkmark on every line. If some services had not been updated, it would show a red stop mark instead.)

Restart the computer if you are prompted to do so.

The next step is something that is almost always forgotten: updating the Lync Server databases (this step is not performed if you just used Windows Update to update your Lync server, so it should then be done manually after Windows Update has updated your server).

1. Start the Lync Server Management Shell (click Start, click All Programs, click Microsoft Lync Server 2010, and then click Lync Server Management Shell).
2. To apply the changes made by LyncServerUpdateInstaller.exe to the SQL Server databases, do one of the following:

On Standard Edition servers and Enterprise Edition front-end servers, once you have installed the update for the core components, the updated SQL files will be dropped on the server. Then run the following cmdlet to apply the changes:

Install-CsDatabase -Update -ConfiguredDatabases -SqlServerFqdn <SQLServerFQDN> -UseDefaultSqlPaths

If the RTCDyn databases are removed after you run the cmdlet without the UseDefaultSqlPaths parameter, run the following cmdlet to restore the RTCDyn databases:

Install-CsDatabase -Update -ConfiguredDatabases -SqlServerFqdn <SQLServerFQDN> -DatabasePaths <database file paths>

7. Now that the database is also up to date, it’s time to start the IIS & Lync Server services. At the command line, type:
net start w3svc
Start-CsWindowsService

And that’s all!