The easiest way to capture all messages from an Azure Event Hub

I am using a service that outputs to an event hub.

We want to save this output so it can be read once a day by a batch job running on Apache Spark. Basically we figured: just get all the messages dumped into blobs.

What is the easiest way to capture messages from an Event Hub into Blob storage?

Our first thought was a Stream Analytics job, but it requires parsing the raw messages (CSV/JSON/Avro), and our current format is none of those.


Update: We resolved this by changing the message format. Still, I would like to know whether there is any low-friction way of dumping messages into blobs. Didn't Event Hub have a solution for this before Stream Analytics arrived?

+5
4 answers

You can use Event Hubs Capture to capture the messages into blobs.
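
Capture only needs to be switched on: on a time/size cadence it writes the raw event bodies into Avro container files in a storage account, so a daily Spark job can read them regardless of the original message format. Below is a sketch of enabling it from code, assuming the Microsoft.Azure.Management.EventHub SDK; the subscription id, resource names, and `credentials` are placeholders.

```csharp
// A sketch of enabling Capture via the Microsoft.Azure.Management.EventHub SDK.
// The subscription id, resource names, and `credentials` (any
// ServiceClientCredentials, e.g. from a service principal) are placeholders.
using System.Threading.Tasks;
using Microsoft.Azure.Management.EventHub;
using Microsoft.Azure.Management.EventHub.Models;
using Microsoft.Rest;

static class CaptureSetup
{
    public static async Task EnableCaptureAsync(ServiceClientCredentials credentials)
    {
        var client = new EventHubManagementClient(credentials)
        {
            SubscriptionId = "<subscription-id>"
        };

        // CreateOrUpdate replaces the hub definition, so a real update should
        // also carry over the existing partition count and retention settings.
        var hub = new Eventhub
        {
            CaptureDescription = new CaptureDescription
            {
                Enabled = true,
                Encoding = EncodingCaptureDescription.Avro, // Capture always writes Avro containers
                IntervalInSeconds = 300,                    // flush every 5 minutes...
                SizeLimitInBytes = 314572800,               // ...or every 300 MB, whichever comes first
                Destination = new Destination
                {
                    Name = "EventHubArchive.AzureBlockBlob",
                    StorageAccountResourceId = "<storage-account-resource-id>",
                    BlobContainer = "eventhub-archive",
                    ArchiveNameFormat =
                        "{Namespace}/{EventHub}/{PartitionId}/{Year}/{Month}/{Day}/{Hour}/{Minute}/{Second}"
                }
            }
        };

        await client.EventHubs.CreateOrUpdateAsync("<resource-group>", "<namespace>", "<hub-name>", hub);
    }
}
```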

+2

You can write your own worker that reads messages from the Event Hub and saves them to Blob storage. You do not need to do this in real time, since messages remain in the Event Hub for the configured retention period. The client reading the hub is responsible for managing its own processing, i.e. keeping track of which messages it has handled and their offsets within the hub. There is a C# library that makes this very easy and scales very well: https://azure.microsoft.com/en-us/documentation/articles/event-hubs-csharp-ephcs-getstarted/
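
That link is the getting-started guide for the EventProcessorHost, which handles partition leasing and offset checkpointing for you. Here is a minimal sketch of that approach, assuming the newer Microsoft.Azure.EventHubs.Processor and WindowsAzure.Storage packages; all connection strings and names are placeholders.

```csharp
// A minimal sketch of the EventProcessorHost approach, using the newer
// Microsoft.Azure.EventHubs.Processor and WindowsAzure.Storage packages.
// Connection strings and the "archive" container name are placeholders.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.EventHubs.Processor;
using Microsoft.WindowsAzure.Storage;

class BlobArchiver : IEventProcessor
{
    public Task OpenAsync(PartitionContext context) => Task.CompletedTask;
    public Task CloseAsync(PartitionContext context, CloseReason reason) => Task.CompletedTask;
    public Task ProcessErrorAsync(PartitionContext context, Exception error) => Task.CompletedTask;

    public async Task ProcessEventsAsync(PartitionContext context, IEnumerable<EventData> messages)
    {
        var container = CloudStorageAccount.Parse("<storage-connection-string>")
            .CreateCloudBlobClient()
            .GetContainerReference("archive");

        // Concatenate the raw bodies and write one blob per batch; the bytes
        // are stored untouched, so the message format does not matter.
        var payload = messages.SelectMany(m => m.Body).ToArray();
        var blob = container.GetBlockBlobReference(
            $"{context.PartitionId}/{DateTime.UtcNow:yyyy/MM/dd/HHmmssfff}.bin");
        await blob.UploadFromByteArrayAsync(payload, 0, payload.Length);

        // Checkpoint so a restart resumes after the last saved batch.
        await context.CheckpointAsync();
    }
}
```

Register it with an EventProcessorHost and the processor runs for every partition, checkpointing after each blob it writes:

```csharp
var host = new EventProcessorHost(
    "myhub",
    PartitionReceiver.DefaultConsumerGroupName,
    "<event-hub-connection-string>",
    "<storage-connection-string>",
    "leases");
await host.RegisterEventProcessorAsync<BlobArchiver>();
```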

+5

Azure now has this built in: Event Hub Archive (in preview).

+1

You can also do this with an Azure Function (serverless code) that runs off an Event Hub trigger.

Depending on your requirements, this may work better than the Capture feature if you need a capability it does not have, such as compressing the output with GZIP or writing to a more conventional blob virtual directory structure.

https://docs.microsoft.com/en-us/azure/azure-functions/functions-bindings-event-hubs#trigger-usage
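
A minimal sketch of such a function, assuming the v2+ Functions runtime with the Event Hubs extension; the hub name, the "EventHubConnection" app setting, and the blob path are placeholders, and {rand-guid} is a built-in binding expression.

```csharp
// A sketch, assuming the v2+ Functions runtime with the Event Hubs extension.
// "myhub", the "EventHubConnection" app setting, and the blob path are
// placeholders; {rand-guid} is a built-in binding expression.
using System.IO;
using System.IO.Compression;
using Microsoft.Azure.EventHubs;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class ArchiveToBlob
{
    [FunctionName("ArchiveToBlob")]
    public static void Run(
        [EventHubTrigger("myhub", Connection = "EventHubConnection")] EventData[] batch,
        [Blob("archive/{rand-guid}.gz", FileAccess.Write)] Stream blobOut,
        ILogger log)
    {
        // GZIP the raw message bodies into one blob per batch; other binding
        // expressions (e.g. {DateTime}) could build a date-based folder layout.
        using (var gzip = new GZipStream(blobOut, CompressionLevel.Optimal))
        {
            foreach (var msg in batch)
                gzip.Write(msg.Body.Array, msg.Body.Offset, msg.Body.Count);
        }
        log.LogInformation($"Archived {batch.Length} messages.");
    }
}
```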

+1
