How to reduce Azure table storage latency?

I have a pretty huge table on Azure (30 million rows, each between 5 and 100 KB).
Each RowKey is a Guid and the PartitionKey is the first part of that Guid, for example:

    PartitionKey = "1bbe3d4b"
    RowKey = "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
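
The keys are generated from a single Guid, roughly like this:

    // RowKey is the full Guid, PartitionKey is its first segment
    Guid id = Guid.NewGuid();                   // e.g. 1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006
    string rowKey = id.ToString();              // "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
    string partitionKey = rowKey.Split('-')[0]; // "1bbe3d4b"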

The table gets 600 reads and 600 writes (updates) per second with an average latency of 60 ms. All queries use both PartitionKey and RowKey.
BUT, some reads take up to 3000 ms (!). On average, more than 1% of all reads take longer than 500 ms, and there is no correlation with the size of the entity (a 100 KB row can be returned in 25 ms while a 10 KB one can take 1500 ms).

My application is an ASP.NET MVC 4 web site running on 4-5 Large instances.

I have read all of the MSDN articles regarding Azure Table Storage performance targets and have already done the following (a rough sketch of these settings follows the list):

  • Nagle's algorithm (UseNagle) is disabled
  • Expect100Continue is disabled as well
  • MaxConnections for the table client is set to 250 (setting it to 1000-5000 makes no difference)
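
Concretely, the client-side settings look roughly like this (applied once at startup, before the first request goes out):

    // Tune the ServicePoint for the table endpoint before any requests are issued
    var account = CloudStorageAccount.Parse(connectionString); // connectionString = the storage connection string
    var tableServicePoint = ServicePointManager.FindServicePoint(account.TableEndpoint);
    tableServicePoint.UseNagleAlgorithm = false;  // UseNagle disabled
    tableServicePoint.Expect100Continue = false;  // Expect100Continue disabled
    tableServicePoint.ConnectionLimit = 250;      // connection limit for the table client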

I also checked that:

  • The storage account's monitoring counters show no throttling errors.
  • There are certain "waves" in performance, although they do not depend on load.

What could be causing these performance problems, and how can I improve the latency?

+5
2 answers

I set DataServiceContext.MergeOption to MergeOption.NoTracking for extra performance when I am not going to update the entity any time soon. Here is an example:

    var account = CloudStorageAccount.Parse(RoleEnvironment.GetConfigurationSettingValue("DataConnectionString"));
    var tableStorageServiceContext = new AzureTableStorageServiceContext(account.TableEndpoint.ToString(), account.Credentials);
    tableStorageServiceContext.RetryPolicy = RetryPolicies.Retry(3, TimeSpan.FromSeconds(1));
    tableStorageServiceContext.MergeOption = MergeOption.NoTracking;
    tableStorageServiceContext.AddObject(AzureTableStorageServiceContext.CloudLogEntityName, newItem);
    tableStorageServiceContext.SaveChangesWithRetries();
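
NoTracking helps on the read path too; a rough sketch of a point query with it, using the classic StorageClient library (the LogEntity type here is only a placeholder):

    // Point query on both PartitionKey and RowKey, with change tracking turned off
    tableStorageServiceContext.MergeOption = MergeOption.NoTracking;
    var entity = (from e in tableStorageServiceContext.CreateQuery<LogEntity>(AzureTableStorageServiceContext.CloudLogEntityName)
                  where e.PartitionKey == "1bbe3d4b"
                     && e.RowKey == "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
                  select e).AsTableServiceQuery().Execute().FirstOrDefault();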

Another problem may be that you are retrieving the whole entity with all its properties even though you only intend to use one or two of them - this is of course wasteful, but it cannot easily be avoided. However, if you use Slazure you can use query projections to retrieve only the entity properties you are interested in from the table storage and nothing more, which gives you better query performance. Here is an example:

    using SysSurge.Slazure;
    using SysSurge.Slazure.Linq;
    using SysSurge.Slazure.Linq.QueryParser;

    namespace TableOperations
    {
        public class MemberInfo
        {
            public string GetRichMembers()
            {
                // Get a reference to the table storage
                dynamic storage = new QueryableStorage<DynEntity>("UseDevelopmentStorage=true");

                // Build a table query that only returns members who earn more than $60k/yr
                // by using a "Where" query filter, and make sure that only the "Name" and
                // "Salary" entity properties are retrieved from the table storage to make the
                // query quicker.
                QueryableTable<DynEntity> membersTable = storage.WebsiteMembers;
                var memberQuery = membersTable.Where("Salary > 60000").Select("new(Name, Salary)");

                var result = "";

                // Cast each query result to a dynamic so that we can access its dynamic properties
                foreach (dynamic member in memberQuery)
                {
                    // Show some information about the member
                    result += "LINQ query result: Name=" + member.Name + ", Salary=" + member.Salary + "<br>";
                }

                return result;
            }
        }
    }

Full disclosure: I coded Slazure.

You could also consider pagination if you are retrieving large data sets, for example:

    // Retrieve 50 members but also skip the first 50 members
    var memberQuery = membersTable.Where("Salary > 60000").Take(50).Skip(50);
+1

Usually, if a particular query has to scan a large number of rows, it will take longer. Is this behavior tied to specific queries or data, or do you see varying performance even for the same data and query?
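
For example (a sketch using the classic StorageClient library; MyEntity, the table name, and the Size property are made up):

    var ctx = CloudStorageAccount.Parse(connectionString).CreateCloudTableClient().GetDataServiceContext();

    // Point lookup on PartitionKey + RowKey: the service reads a single entity
    var one = (from e in ctx.CreateQuery<MyEntity>("Items")
               where e.PartitionKey == "1bbe3d4b" && e.RowKey == "1bbe3d4b-2230-4b4f-8f5f-fe5fe1d4d006"
               select e).AsTableServiceQuery().Execute().FirstOrDefault();

    // Filter on a non-key property: the service must scan rows, so latency grows with the data scanned
    var many = (from e in ctx.CreateQuery<MyEntity>("Items")
                where e.Size > 10000
                select e).AsTableServiceQuery().Execute().ToList();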

0
