Azure AD Graph API Throttling Guidance

I have been working with Microsoft Support and the product groups for a while now to get some formal guidance on throttling in the Azure AD Graph API, and also in the Microsoft Graph API. Microsoft have acknowledged that this is not clearly documented and have advised that they will be releasing some documentation in the near future.

This post is to share some of the information that has been obtained from working with the support and product teams. Hopefully this will be useful to some people in the interim, until formal documentation has been released.

Please note that this information relates specifically to the Azure AD Graph API. It could be assumed that the Microsoft Graph API has the same behaviour, but that cannot be guaranteed. This information has also come from the support channels, so it may be subject to change and should be used as a guide only until formal documentation is published.

Why do we need a throttling policy?

To understand the limits and responses of Azure AD Graph API throttling, we first need to understand what throttling is and why it is required. Azure AD, like most cloud services in Azure and Office 365, operates as a shared service. This means that the same infrastructure, whether it be API endpoints, virtualisation hosts, storage clusters, etc., is shared by different customers.

Throttling aims to limit the amount of resources a single customer can consume on the overall service, so that other customers' services and experiences are not negatively impacted. With the Azure AD Graph API, it is quite difficult for Microsoft to provide hard limits around throttling, as the service is dynamic and different circumstances may affect its overall performance. Put simply, the number of requests the service can handle at any given time may be higher or lower.

Rather than providing hard fixed limits for the Azure AD Graph API, like they do with many other Azure Services, Microsoft provide varying responses to your calls to allow you to dynamically and programmatically wait and back-off.

Azure AD Graph API Throttling responses

There are various responses that you may receive when calling the Azure AD Graph API, and the guidance for handling each of them is set out below. It is also worth noting that there could be rare scenarios where the overall service is experiencing high load, and your tenant may be throttled even if it is not generating a high volume of requests against the API.

  • HTTP Status Code 429 with simple HTML response “Too Many Requests” (No OData response)

There is an entry point that controls traffic into the Graph API service. According to product support, there is a limit of 1000 requests per second to this entry point from a single source IP. There is also a separate limit of 120 requests per second for each ApplicationID+TenantID combination. If you require more than this, then you should look to spread the requests across multiple source IPs and applications.

There is also a dynamically changing, tenant-specific write request limit in place. It is possible that the total sum of all write operations across all applications reaches the tenant limit before either of the preceding limits is hit.

If the entry point generates a “Too Many Requests” response, it is recommended to back off for 5 minutes.
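To make the per-second figures above a little more concrete, here is a minimal sketch of client-side pacing. It assumes you want to keep a single application and tenant under a budget such as the 120 requests per second quoted above. The class name `SlidingWindowLimiter` and its methods are my own illustration and not part of any Microsoft library, and staying under the budget does not guarantee you will avoid throttling, because the tenant write limit changes dynamically.

```python
import collections
import time


class SlidingWindowLimiter:
    """Allow at most max_calls per period seconds (client-side pacing only)."""

    def __init__(self, max_calls, period=1.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested
        self._sent = collections.deque()  # timestamps of recent calls

    def wait_time(self):
        """Return seconds to wait before the next call (0.0 if none)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self._sent and now - self._sent[0] >= self.period:
            self._sent.popleft()
        if len(self._sent) < self.max_calls:
            return 0.0
        return self.period - (now - self._sent[0])

    def record(self):
        """Note that a call was just sent."""
        self._sent.append(self.clock())
```

Before each request you would sleep for `wait_time()` seconds and then call `record()`. A limiter created with `SlidingWindowLimiter(120)` corresponds to the ApplicationID+TenantID figure, but treat that number as a guide only.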

  • HTTP Status Code 429 with OData response.

{"odata.error":{"code":"Request_ThrottledTemporarily","message":{"lang":"en","value":"Your request is throttled temporarily. Please try after 284.1295407 seconds."},"date":"2016-10-06T02:22:331","requestId":"xxxxxx-xxxx-xxxxx-xxxxxx","values":[{"item":"BackoffTime","value":"284"}]}}

If the service returns a throttle response like this, it is recommended to wait for the back-off time given in the response (the BackoffTime value, in seconds) before retrying.
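As a rough sketch of how the back-off time could be read from that response, the snippet below pulls the BackoffTime entry out of the body. The function name `backoff_seconds` is hypothetical, and the parsing assumes the body has the same shape as the sample above. Falling back to 300 seconds when the body is not OData simply mirrors the five-minute guidance for the plain HTML response.

```python
import json


def backoff_seconds(body, default=300.0):
    """Return the BackoffTime (seconds) from an OData throttle body.

    Falls back to default when the body is not JSON or has no BackoffTime.
    """
    try:
        error = json.loads(body)["odata.error"]
        for entry in error.get("values") or []:
            if entry.get("item") == "BackoffTime":
                return float(entry["value"])
    except (ValueError, KeyError, TypeError, AttributeError):
        pass  # Not JSON, or not the shape shown in the sample response.
    return default
```

Reading the structured BackoffTime value avoids having to scrape the number out of the human-readable message text.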

  • HTTP Status code 503 Service Unavailable.

In this case, either the entry point or the Graph API service is overwhelmed, and it is recommended to back off for 5 minutes.
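Putting the three responses together, a retry loop might look something like the sketch below. It is only an illustration under some assumptions: `send` is a hypothetical callable that makes the request and returns a `(status, hint)` pair, where `hint` is the server-supplied back-off in seconds or `None` when there is none (the plain 429 and the 503). The attempt cap of 5 is my own choice and does not come from Microsoft.

```python
import time

FIXED_BACKOFF = 300.0  # five minutes, the fallback suggested in this post


def call_with_backoff(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it is not throttled or attempts run out.

    send is a hypothetical callable returning (status, hint), where hint is
    the server-supplied back-off in seconds, or None when there is none.
    Returns the last HTTP status code seen.
    """
    for attempt in range(max_attempts):
        status, hint = send()
        if status not in (429, 503):
            return status
        # Only wait if another attempt will follow.
        if attempt < max_attempts - 1:
            sleep(hint if hint is not None else FIXED_BACKOFF)
    return status
```

Passing `sleep` as a parameter keeps the loop easy to test without actually waiting five minutes.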