Kafka as a data store for future events

I have a Kafka cluster that receives messages from a source whenever data changes in that source. Some of these messages are meant to be processed in the future. This leaves me with two options:

  • Consume all messages and republish those intended for the future to a different Kafka topic (with the date in the topic name), then have a Storm topology look for topics named for the current day. This ensures that messages are processed only on the day they were intended for (see the sketch after this list).
  • Store them in a separate database and build a scheduler that reads those messages and publishes them to Kafka only when the intended date arrives.
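A minimal sketch of option 1, assuming plain String payloads and a process-on date known at routing time; the topic naming scheme and the payload are illustrative, not from the question:

    import java.time.LocalDate;
    import java.time.format.DateTimeFormatter;
    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class FutureEventRouter {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Hypothetical message that should be handled three days from now.
                LocalDate processOn = LocalDate.now().plusDays(3);
                String payload = "{\"orderId\": 42}";

                // Route it to a date-stamped topic, e.g. "events-2024-06-01".
                String topic = "events-" + processOn.format(DateTimeFormatter.ISO_LOCAL_DATE);
                producer.send(new ProducerRecord<>(topic, payload));
            }
        }
    }

A daily job (or the Storm topology) would then subscribe only to events-<today>. Note that with this scheme the date topics must either be created up front or the broker must allow topic auto-creation, and stale date topics need to be cleaned up eventually.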

Option 1 is easier to implement, but my question is: is Kafka reliable as a long-term data store? Has anyone built something like this with Kafka? Are there any holes in this design?

1 answer

You can configure how long your messages are retained in Kafka (log.retention.hours).
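For example (the values here are illustrative), the broker-wide default lives in server.properties, and Kafka also lets you override retention per topic with retention.ms:

    # server.properties: keep messages for 30 days by default
    log.retention.hours=720

    # per-topic override (flag names for versions that support --bootstrap-server)
    bin/kafka-configs.sh --bootstrap-server localhost:9092 \
      --alter --entity-type topics --entity-name events-2024-06-01 \
      --add-config retention.ms=7776000000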

However, I would not use Kafka as a "database" that keeps messages around indefinitely; once the retention window passes, they are gone. Beyond that, the Kafka + Storm combination seems like an awkward fit for work that only runs on a given day. Have you considered a batch-processing framework (MapReduce, Spark...) instead?
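As a rough sketch of that batch alternative (the storage path and the scheduled_date field are assumptions, not anything from the question), a daily Spark job could select just the events that are due:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;

    public class DueEventsJob {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                .appName("due-events")
                .getOrCreate();

            // Read all stored future events and keep only those scheduled for today.
            Dataset<Row> events = spark.read().json("hdfs:///events/future/");
            Dataset<Row> dueToday = events.filter("scheduled_date = current_date()");

            dueToday.show(); // stand-in for the real processing step
        }
    }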

