What does FabricNotReadableException mean? And how should we respond to it?

We use the following method in a stateful service on Service Fabric. The service has partitions. Sometimes we get a FabricNotReadableException from this piece of code.

    public async Task HandleEvent(EventHandlerMessage message)
    {
        var queue = await StateManager.GetOrAddAsync<IReliableQueue<EventHandlerMessage>>(EventHandlerServiceConstants.EventHandlerQueueName);
        using (ITransaction tx = StateManager.CreateTransaction())
        {
            await queue.EnqueueAsync(tx, message);
            await tx.CommitAsync();
        }
    }

Does this mean that the partition is down and being moved? Or that we hit a secondary partition? Because in some cases a FabricNotPrimaryException is also thrown.

I saw the MSDN link ( https://msdn.microsoft.com/en-us/library/azure/system.fabric.fabricnotreadableexception.aspx ). But what does

Represents an exception that is thrown when a partition cannot accept reads.

mean? What happened such that the partition cannot accept reads?

c# azure-service-fabric
2 answers

Under the covers, Service Fabric has several states that can affect whether a given replica can safely serve reads and writes. They are:

  • Granted (you can think of this as normal operation)
  • Not Primary
  • No Write Quorum (again, this mainly affects writes)
  • Reconfiguration Pending

The FabricNotPrimaryException you mention can be thrown whenever a write is attempted on a replica that is not currently the Primary; it maps to the NotPrimary state.

FabricNotReadableException maps to the other states (you do not really need to worry about or distinguish between them), and it can happen in a variety of cases. One example is when the replica you are trying to read from is a "Standby" replica (a replica that was down and has since recovered, but the replica set already has enough active replicas). Another example is when the replica is the Primary but is being closed (say, because of an upgrade or because it reported a fault), or when it is currently undergoing reconfiguration (say, because another replica is being added). All of these conditions leave the replica unable to satisfy writes for a short period of time, because of certain safety checks and atomic changes that Service Fabric has to handle under the hood.
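As a rough summary of the mapping described above, here is a minimal sketch of how a caller might decide what to do for each state. The `AccessStatus` enum is a stand-in written for this example that mirrors the member names of `System.Fabric.PartitionAccessStatus`, so that the snippet compiles without the Service Fabric SDK; `CallerAction` and `AccessStatusGuide` are hypothetical names, not part of any Service Fabric API.

```csharp
using System;

// Stand-in for System.Fabric.PartitionAccessStatus (assumption: same member names).
public enum AccessStatus { Invalid, Granted, ReconfigurationPending, NotPrimary, NoWriteQuorum }

// Hypothetical: what the caller should do next.
public enum CallerAction { Proceed, RetryLater, ReResolvePrimary }

public static class AccessStatusGuide
{
    // Maps a partition access status to a caller reaction, following the
    // description in this answer (not an official Service Fabric rule table).
    public static CallerAction ForStatus(AccessStatus status)
    {
        switch (status)
        {
            case AccessStatus.Granted:
                // Normal operation: the read or write can go ahead.
                return CallerAction.Proceed;
            case AccessStatus.NotPrimary:
                // Surfaces as FabricNotPrimaryException: find the current Primary.
                return CallerAction.ReResolvePrimary;
            case AccessStatus.ReconfigurationPending:
            case AccessStatus.NoWriteQuorum:
                // The "other" states: expected to clear up after a short while.
                return CallerAction.RetryLater;
            default:
                throw new ArgumentOutOfRangeException(nameof(status), status, "Unexpected access status.");
        }
    }
}
```

In a real stateful service the current values would presumably come from the partition's `ReadStatus` and `WriteStatus` properties rather than from a hand-written enum.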

You can consider FabricNotReadableException retriable. If you see it, just try the call again; eventually it will resolve into either NotPrimary or Granted. If you get a FabricNotPrimaryException, it should generally be thrown back to the client (or the client should be notified in some way) so that the client re-resolves to find the current Primary. The default communication stacks that Service Fabric ships take care of watching for non-retriable exceptions and re-resolving on your behalf.
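A minimal sketch of the "just try the call again" advice, assuming a small hand-rolled helper. `TransientRetry`, its parameters, and the backoff values are all hypothetical and not part of Service Fabric; the helper takes a predicate so that it compiles without the SDK.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TransientRetry
{
    // Runs `operation`, retrying while `isTransient` accepts the thrown exception.
    // Any other exception, or the failure of the last attempt, propagates to the caller.
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isTransient,
        int maxAttempts = 5,
        TimeSpan? initialDelay = null,
        CancellationToken cancellationToken = default)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        TimeSpan delay = initialDelay ?? TimeSpan.FromMilliseconds(100);

        for (int attempt = 1; ; attempt++)
        {
            cancellationToken.ThrowIfCancellationRequested();
            try
            {
                return await operation().ConfigureAwait(false);
            }
            catch (Exception ex) when (attempt < maxAttempts && isTransient(ex))
            {
                // Treated as transient: wait, then retry with a doubled delay.
                await Task.Delay(delay, cancellationToken).ConfigureAwait(false);
                delay = TimeSpan.FromTicks(delay.Ticks * 2);
            }
        }
    }
}
```

Inside a service, the predicate could be `ex => ex is FabricNotReadableException`, with the enqueue-and-commit transaction as the operation. A FabricNotPrimaryException would then fail the predicate and propagate, so the client can re-resolve the Primary.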

There are two current known issues with FabricNotReadableException.

  • FabricNotReadableException should have two variants. The first should be explicitly retriable (FabricTransientNotReadableException) and the second should be FabricNotReadableException. The first version (transient) is the most common and is probably what you are running into; it is certainly what you would run into in the majority of cases. The second (non-transient) would be returned when you end up talking to a Standby replica. Talking to a Standby will not happen with the out-of-the-box transports and retry logic, but if you have your own, you may run into it.
  • The other issue is that, today, FabricNotReadableException should derive from FabricTransientException, which would make it easier to determine the correct behavior.

Posted as an answer (to asnider's comment, March 16 at 17:42) because it is too long for a comment! :)

I am also stuck in this catch-22. My service starts up and immediately receives messages. I want to encapsulate the service startup in OpenAsync and set up some ReliableDictionary values, and only then start receiving messages. However, the Fabric is not readable at that point, so I need to split this "startup" between OpenAsync and RunAsync :(

RunAsync in my service and OpenAsync in my client also seem to have different cancellation tokens, so I need to work out how to deal with that too. It all just feels a bit messy. I have a number of ideas on how to tidy this up in my code, but has anyone come up with an elegant solution?

It would be nice if ICommunicationClient had a RunAsync interface that was called when the Fabric becomes ready/readable and cancelled when the Fabric shuts down the replica; this would greatly simplify my life. :)

