Is there a way to determine why Azure App Service rebooted?

Question

I have a bunch of websites running on the same Azure App Service instance, and they are all set to Always On. All of them suddenly restarted at the same time, and everything was slow for several minutes while every site served its first cold request.

I would expect this if the service had moved me to a new host, but that did not happen: I'm still on the same host name.

CPU and memory usage were normal at the time of the restart, and I did not initiate any deployments or anything like that. I do not see any obvious reason for the restart.

Is there any logging anywhere where I can see why they all restarted? Or is this just a normal thing that App Service does from time to time?

azure azure-web-sites

Nicholas Piasecki Jul 10 '17 at 21:10

1 answer

Nicholas Piasecki · Accepted Answer · 2017-07-21T16:07:56+0000

So, it seems that the answer to this question is "no, you can't really know why, you can only infer that it happened."

I mean, you can add some logging to your application, for example:

    private void Application_End()
    {
        log.Warn($"The application is shutting down because of '{HostingEnvironment.ShutdownReason}'.");

        TelemetryConfiguration.Active.TelemetryChannel.Flush();

        // Server Channel flush is async, wait a little while and hope for the best
        Thread.Sleep(TimeSpan.FromSeconds(2));
    }

and you will end up with "The application is shutting down because of 'ConfigurationChange'." or "The application is shutting down because of 'HostingEnvironment'.", but that does not really tell you what is going on at the host level.

What I had to accept is that App Service is going to restart things from time to time, and ask myself why I cared. App Service is supposed to be smart enough to wait for the application pool to warm up before sending requests to it (like overlapped recycling). Yet my apps would sit there CPU-crunching for 1-2 minutes after a recycle.

It took me a while to figure out, but the culprit was that all of my apps have a rewrite rule to redirect from HTTP to HTTPS. This does not work with the Application Initialization module: it sends a request to the root of the site, and all it gets back is a 301 redirect from the URL Rewrite module. The ASP.NET pipeline is not hit at all, so the hard work is never actually done. App Service/IIS then thinks the worker process is ready and sends traffic to it. But the first "real" request actually follows the 301 redirect to the HTTPS URL, and bam! That user hits the pain of a cold start.

I added the rewrite rule described here to exempt the Application Initialization module from needing HTTPS, so that when it hits the root of the site, it actually triggers the page load and therefore the whole pipeline:

    <rewrite>
      <rules>
        <clear />
        <rule name="Do not force HTTPS for application initialization" enabled="true" stopProcessing="true">
          <match url="(.*)" />
          <conditions>
            <add input="{HTTP_HOST}" pattern="localhost" />
            <add input="{HTTP_USER_AGENT}" pattern="Initialization" />
          </conditions>
          <action type="Rewrite" url="{URL}" />
        </rule>
        <rule name="Force HTTPS" enabled="true" stopProcessing="true">
          <match url="(.*)" ignoreCase="false" />
          <conditions>
            <add input="{HTTPS}" pattern="off" />
          </conditions>
          <action type="Redirect" url="https://{HTTP_HOST}/{R:1}" appendQueryString="true" redirectType="Permanent" />
        </rule>
      </rules>
    </rewrite>

This is one of many entries in the diary of moving old-school applications to Azure. It turns out there are a lot of things you can get away with when something runs on a traditional virtual machine that rarely restarts, but it takes some TLC to work out the kinks when migrating to our brave new world in the cloud.

-

UPDATE 10/27/2017: Since this was written, Azure has added a new tool under "Diagnose and solve problems". Click "Web App Restarted" and it will tell you the reason, usually storage latency or infrastructure upgrades. The above still stands, though: when moving to Azure App Service, the best way forward is to coax your application into being comfortable with random restarts.

-

UPDATE 11/11/2018: After migrating several legacy systems to a single instance of a medium App Service plan (with plenty of CPU and memory headroom), I had a vexing problem: my deployments from staging slots would go seamlessly, but whenever I got booted to a new host because of Azure infrastructure maintenance, everything would go haywire with 2-3 minutes of downtime. I was driving myself nuts trying to figure out why this was happening, because App Service is supposed to wait until it receives a successful response from your app before booting you to the new host.

I was so frustrated that I was ready to classify App Service as enterprise garbage and go back to IaaS virtual machines.

It turned out to be several issues, and I suspect that others will run into them while porting their own beastly, legacy ASP.NET applications to App Service, so I thought I would run through them all here.

The first thing to check is that you are actually doing real work in your Application_Start. For example, I use NHibernate, which, while good at many things, is quite a pig when loading its configuration, so I make sure to actually create the SessionFactory during Application_Start to ensure that the hard work is done.

The second thing to check, as mentioned above, is that you do not have a rewrite rule for SSL that interferes with the App Service warm-up check. You can exclude the warm-up checks from your rewrite rule as described above. Or, in the time since I originally wrote that workaround, App Service has added an HTTPS Only flag that lets you do the HTTPS redirect at the load balancer instead of in your web.config file. Since it is handled at a layer of indirection above your application code, you do not have to think about it, so I would recommend the HTTPS Only flag as the way to go.
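As a rough illustration of doing the expensive work up front, here is a minimal sketch that builds the NHibernate session factory inside Application_Start. The class name `Global`, the static `SessionFactory` property, and the use of the default `Configure()` (which reads the standard NHibernate configuration) are assumptions for the example; the answer does not show how its own factory is configured.

```csharp
using System;
using System.Web;
using NHibernate;
using NHibernate.Cfg;

// Hypothetical Global.asax code-behind; names are illustrative only.
public class Global : HttpApplication
{
    // Created once per app domain. The factory is expensive to build,
    // so it is shared by every request afterwards.
    public static ISessionFactory SessionFactory { get; private set; }

    protected void Application_Start(object sender, EventArgs e)
    {
        // Configure() loads the NHibernate configuration and mappings;
        // BuildSessionFactory() does the slow part. Doing it here means the
        // warm-up request pays the cost instead of the first real user.
        SessionFactory = new Configuration()
            .Configure()
            .BuildSessionFactory();
    }
}
```

The point of the design is simply that nothing slow is left lazy: if the factory were created on first use, a warm-up request that never touches the database would report the app as ready while the real cost was still ahead.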

The third thing to consider is whether you are using the App Service Local Cache option. In brief, this is an option where App Service copies your application files to the local storage of the instances it runs on, rather than serving them from a network share. It is a great option to enable if your application does not care about losing changes written to the local file system. It speeds up I/O performance (which is important because, remember, App Service runs on potatoes) and eliminates the restarts caused by any maintenance on the network share.

But there is a subtlety regarding App Service infrastructure upgrades that is poorly documented and that you need to be aware of. Specifically, the Local Cache is initialized in the background in a separate app domain after the first request, and you are then switched to that app domain when the local cache is ready. This means that App Service will send a warm-up request to your site, get a successful response, and point traffic to that instance, but (whoops!) now Local Cache is grinding I/O in the background, and if you have a lot of sites on that instance, you grind to a halt because App Service I/O is horrendous. If you do not know this is happening, it looks spooky in the logs, because it looks as if your application is starting up twice on the same instance (because it is).

The solution is to follow this Jet blog post and create an application initialization warm-up page that monitors the environment variable that tells you when the local cache is ready. That way you can force App Service to delay booting you to the new instance until the local cache is fully prepared. Here is the one I use, which also makes sure I can talk to the database:

    using System;
    using System.Net;
    using System.Web;
    using Newtonsoft.Json;
    using NHibernate;

    public class WarmupHandler : IHttpHandler
    {
        public bool IsReusable
        {
            get { return false; }
        }

        public ISession Session { get; set; }

        public void ProcessRequest(HttpContext context)
        {
            if (context == null)
            {
                throw new ArgumentNullException("context");
            }

            var response = context.Response;

            var localCacheVariable = Environment.GetEnvironmentVariable("WEBSITE_LOCAL_CACHE_OPTION");
            var localCacheReadyVariable = Environment.GetEnvironmentVariable("WEBSITE_LOCALCACHE_READY");
            var databaseReady = true;

            // Run a trivial query to prove that the database is reachable.
            try
            {
                using (var transaction = this.Session.BeginTransaction())
                {
                    this.Session.QueryOver<User>()
                        .Take(1)
                        .SingleOrDefault<User>();
                    transaction.Commit();
                }
            }
            catch
            {
                databaseReady = false;
            }

            var result = new
            {
                databaseReady,
                machineName = Environment.MachineName,
                localCacheEnabled = "Always".Equals(localCacheVariable, StringComparison.OrdinalIgnoreCase),
                localCacheReady = "True".Equals(localCacheReadyVariable, StringComparison.OrdinalIgnoreCase),
            };

            response.ContentType = "application/json";

            // Warm only when the database answers and the local cache (if enabled) is ready.
            var warm = result.databaseReady && (!result.localCacheEnabled || result.localCacheReady);

            response.StatusCode = warm ? (int)HttpStatusCode.OK : (int)HttpStatusCode.ServiceUnavailable;

            var serializer = new JsonSerializer();
            serializer.Serialize(response.Output, result);
        }
    }

Also, remember to map a route to the handler and add the application initialization section to your web.config:

    <applicationInitialization doAppInitAfterRestart="true">
      <add initializationPage="/warmup" />
    </applicationInitialization>
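The answer does not show how the route is mapped, so the following is only one possible sketch using ASP.NET routing. `WarmupRouteHandler`, `RouteConfig`, and `RegisterWarmupRoute` are hypothetical names, and supplying the session through a `Func<ISession>` is an assumption; the original handler most likely gets its `Session` from whatever dependency injection the application already uses. The sketch relies on the `WarmupHandler` class shown above.

```csharp
using System;
using System.Web;
using System.Web.Routing;
using NHibernate;

// Hypothetical route handler that hands the request to WarmupHandler.
public class WarmupRouteHandler : IRouteHandler
{
    private readonly Func<ISession> openSession;

    public WarmupRouteHandler(Func<ISession> openSession)
    {
        this.openSession = openSession;
    }

    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        // A fresh handler per request, since IsReusable is false.
        // Assumption: the session is opened here; disposing it is not
        // shown and should follow your application's own conventions.
        return new WarmupHandler { Session = this.openSession() };
    }
}

public static class RouteConfig
{
    public static void RegisterWarmupRoute(RouteCollection routes, ISessionFactory sessionFactory)
    {
        // "warmup" lines up with initializationPage="/warmup" in web.config.
        routes.Add(
            "Warmup",
            new Route("warmup", new WarmupRouteHandler(() => sessionFactory.OpenSession())));
    }
}
```

Under these assumptions, Application_Start would call something like `RouteConfig.RegisterWarmupRoute(RouteTable.Routes, sessionFactory)` once the session factory has been built.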

The fourth thing to consider is that sometimes App Service restarts your app for seemingly garbage reasons. It seems that setting the fcnMode property to Disabled can help; it prevents the application from restarting if someone fiddles with configuration files or code on the server. If you use staging slots and deploy that way, this should not bother you. But if you expect to be able to FTP in, fiddle with a file, and see that change reflected in production, do not use this option:

    <httpRuntime fcnMode="Disabled" targetFramework="4.5" />

The fifth thing to consider, and this was my primary problem all along, is whether you are using staging slots with the AlwaysOn option enabled. The AlwaysOn option works by pinging your site every minute or so to make sure it is warm, so that IIS does not spin it down. Inexplicably, this is not a sticky setting, so you may have turned on AlwaysOn in both your production and staging slots so that you do not have to mess with it every time. This causes a problem with App Service infrastructure upgrades when they boot you to a new host. Here is what happens: say you have 7 sites hosted on an instance, each with its own staging slot, all with AlwaysOn enabled. App Service does the warm-up and application initialization for your 7 production slots and dutifully waits for them to respond successfully before redirecting traffic. But it does not do this for the staging slots. So it directs traffic to the new instance, but then AlwaysOn kicks in 1-2 minutes later on the staging slots, and now you have 7 more sites starting up at the same time. Remember, App Service runs on potatoes, so all of this additional I/O happening at once destroys the performance of your production slots and is perceived as downtime.

The solution is to keep AlwaysOn off on your staging slots so that you do not get nailed by this simultaneous I/O frenzy after an infrastructure update. If you use a swap script via PowerShell, maintaining this "Off in staging, On in production" arrangement is surprisingly verbose:

    Login-AzureRmAccount -SubscriptionId {{ YOUR_SUBSCRIPTION_ID }}

    $resourceGroupName = "YOUR-RESOURCE-GROUP"
    $appName = "YOUR-APP-NAME"
    $slotName = "YOUR-SLOT-NAME-FOR-EXAMPLE-STAGING"

    # Turn AlwaysOn on in the staging slot before the swap
    $props = @{ siteConfig = @{ alwaysOn = $true; } }

    Set-AzureRmResource `
        -PropertyObject $props `
        -ResourceType "microsoft.web/sites/slots" `
        -ResourceGroupName $resourceGroupName `
        -ResourceName "$appName/$slotName" `
        -ApiVersion 2015-08-01 `
        -Force

    # Swap staging into production
    Swap-AzureRmWebAppSlot `
        -SourceSlotName $slotName `
        -ResourceGroupName $resourceGroupName `
        -Name $appName

    # Turn AlwaysOn back off in the staging slot after the swap
    $props = @{ siteConfig = @{ alwaysOn = $false; } }

    Set-AzureRmResource `
        -PropertyObject $props `
        -ResourceType "microsoft.web/sites/slots" `
        -ResourceGroupName $resourceGroupName `
        -ResourceName "$appName/$slotName" `
        -ApiVersion 2015-08-01 `
        -Force

This script sets AlwaysOn to on in the staging slot, does the swap so that staging becomes production, and then sets AlwaysOn to off in the staging slot so that it does not blow things up after an infrastructure upgrade.

Once you get it working, it really is nice to have a PaaS that handles security updates and hardware failures for you. But it is a little more difficult to achieve in practice than the marketing materials might suggest. Hope this helps someone.
