Why should I use Amazon Kinesis and not SNS-SQS?

Question

I have a use case in which there will be a stream of data, and I cannot consume it at the pace it arrives, so I need a buffer. This can be solved with an SNS-SQS queue. I learned that Kinesis serves the same purpose, so what is the difference? Why should (or shouldn't) I prefer Kinesis?

+133

amazon-web-services amazon-sqs amazon-kinesis

Apoorv Oct 29 '14 at 6:01

10 answers

Keep in mind that this answer was correct as of June 2015.

After studying the problem for a while with the same question in mind, I found that SQS (with SNS) is preferable for most use cases, unless the order of the messages is important to you (SQS does not guarantee FIFO ordering of messages).

Kinesis has 2 main advantages: (1) you can read the same message from several applications and (2) you can re-read the messages if necessary.

Both benefits can be achieved by using SNS as a fan-out to SQS. This means that the message producer sends only one message to SNS. SNS then fans the message out to several SQS queues, one for each consumer application. This way you can have as many consumers as you want, without thinking about how to share capacity.

In addition, we added one more SQS queue, subscribed to the SNS topic, that retains messages for 14 days. Normally no one reads from this queue, but if a bug makes us want to rewind the data, we can easily read all the messages from this queue and resend them to SNS. Kinesis, by contrast, provides only 7 days of retention.
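To make the fan-out and rewind idea concrete, here is a minimal in-memory sketch. It does not call AWS at all: `Topic` and `Queue` are hypothetical stand-ins for an SNS topic and SQS queues, and the consumer names are made up. Unsubscribing the archive queue before replaying is an assumption of this sketch; the answer does not say how that detail was handled.

```python
from collections import deque


class Queue:
    """Stand-in for one SQS queue: a message is gone once it is received."""

    def __init__(self, name):
        self.name = name
        self.messages = deque()

    def send(self, message):
        self.messages.append(message)

    def receive_all(self):
        """Drain the queue (receive and delete in one step, for brevity)."""
        drained = list(self.messages)
        self.messages.clear()
        return drained


class Topic:
    """Stand-in for an SNS topic: every subscribed queue gets its own copy."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        for queue in self.subscribers:
            queue.send(message)


topic = Topic()
billing, audit, archive = Queue("billing"), Queue("audit"), Queue("archive")
for q in (billing, audit, archive):
    topic.subscribe(q)

for order in ("order-1", "order-2"):
    topic.publish(order)  # the producer sends each message only once

first_pass = billing.receive_all()  # normal consumption

# "Rewind": drain the archive queue and publish everything again.
# The archive is unsubscribed first so it does not receive its own replay.
topic.subscribers.remove(archive)
for message in archive.receive_all():
    topic.publish(message)

replayed = billing.receive_all()  # billing sees the same messages again
```

Note that the replay reaches every remaining subscriber, not just the one that needs it, so consumers have to tolerate duplicates.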

In conclusion, SNS + SQS is much simpler and provides most of the features. IMO, you need a really strong argument to choose Kinesis.

+69

Roee Gavirel Jun 16 '15 at 10:25

Kinesis supports multiple consumers, which means that the same data records can be processed by different consumers at the same time or at different times within 24 hours. Similar behavior can be achieved in SQS by writing to several queues, with each consumer reading from its own queue. However, writing the message again to several queues adds a small delay (a few milliseconds) to the system.

Secondly, Kinesis provides routing: a partition key selectively routes data records to different shards, which can then be processed by specific EC2 instances. This enables micro-batch calculations (counting and aggregation).
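A rough sketch of how a partition key can pin records to a shard. Kinesis hashes the partition key with MD5 into a 128-bit integer and picks the shard whose hash key range contains it; the assumption here is that the hash space is split evenly across shards, which need not hold after resharding. `shard_for` and the sample events are hypothetical names for illustration.

```python
import hashlib
from collections import Counter

HASH_SPACE = 2 ** 128  # MD5 yields a 128-bit value


def shard_for(partition_key: str, shard_count: int) -> int:
    """Return the index of the (evenly sized) hash range containing the key."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * shard_count // HASH_SPACE


# Because a key always lands on the same shard, each shard's consumer can
# keep a complete running count for the keys it owns.
events = [("customer-a", 1), ("customer-b", 2), ("customer-a", 3), ("customer-c", 5)]
counts_per_shard = {}
for key, amount in events:
    shard = shard_for(key, shard_count=4)
    counts_per_shard.setdefault(shard, Counter())[key] += amount
```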

Working with any AWS service is easy, but SQS is the easiest. With Kinesis, you need to provision enough shards ahead of time, and you also have to manage increasing the number of shards dynamically to handle load spikes and decreasing it again to save cost. That is a pain in Kinesis. SQS requires none of this. SQS is infinitely scalable.

+51

kartik Nov 10 '14 at 5:11

The semantics of these technologies are different because they were designed to support different scenarios:

SNS / SQS: items in the stream are not related to each other
Kinesis: items in a stream are connected to each other

Let me illustrate the difference with an example.

Suppose we have a stream of orders, and for each order we need to reserve some stock and schedule delivery. Once that is done, we can safely remove the item from the stream and start processing the next order. We are completely done with the previous order before we move on to the next one.
Now take the same stream of orders, but this time our goal is to group orders by destination. Once we have, say, 10 orders for the same place, we want to deliver them together (delivery optimization). Now the story is different: when we get a new item from the stream, we cannot finish processing it; instead we "wait" for more items to arrive for the same destination. Moreover, if the processor crashes, we must "restore" the state (so that no order is lost).

As soon as the processing of one item cannot be separated from the processing of another, we need Kinesis semantics to handle all cases safely.
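The second scenario can be sketched like this. It is only an illustration: `group_for_delivery`, the order IDs and the city names are made up, and the batch size of 10 comes from the example above. The point is the `pending` dictionary: it is state that spans many stream items, so after a crash it has to be rebuilt, for instance by re-reading the stream from the last checkpoint.

```python
from collections import defaultdict

BATCH_SIZE = 10  # "say, 10 orders" from the example


def group_for_delivery(orders, batch_size=BATCH_SIZE):
    """Group (order_id, destination) pairs into per-destination batches.

    Returns the completed batches and whatever is still waiting.
    """
    pending = defaultdict(list)  # in-flight state that must survive a crash
    batches = []
    for order_id, destination in orders:
        pending[destination].append(order_id)
        if len(pending[destination]) == batch_size:
            # The batch is full: ship it and forget those orders.
            batches.append((destination, pending.pop(destination)))
    return batches, dict(pending)


orders = [(i, "Berlin" if i % 2 == 0 else "Paris") for i in range(25)]
batches, pending = group_for_delivery(orders)
```

With a queue that deletes each message once it is acknowledged, the orders still sitting in `pending` would already be gone from the queue when the processor dies.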

+36

Konstantin Triger Nov 02 '17 at 20:58

The biggest advantage for me is that Kinesis supports replay, while SQS does not. So you can have several consumers of the same Kinesis messages (or the same consumer at different times), whereas with SQS, once a message has been acknowledged, it is gone from that queue. Because of this, SQS is better suited to work queues.
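A toy model of the two behaviours, with no AWS calls involved. `Stream` and `WorkQueue` are hypothetical classes: the first keeps every record and lets each reader choose its own starting position, the second hands each message to one reader and then drops it.

```python
class Stream:
    """Append-only log: reading does not remove anything."""

    def __init__(self):
        self.records = []

    def put(self, record):
        self.records.append(record)

    def read(self, start=0):
        """Return all records from position `start` onward."""
        return self.records[start:]


class WorkQueue:
    """Queue semantics: an acknowledged message is gone for everyone."""

    def __init__(self):
        self.messages = []

    def send(self, message):
        self.messages.append(message)

    def receive_and_ack(self):
        return self.messages.pop(0) if self.messages else None


stream = Stream()
for record in ("a", "b", "c"):
    stream.put(record)

dashboard_view = stream.read(0)  # one consumer reads everything
archiver_view = stream.read(0)   # a second consumer sees the same records
replay_view = stream.read(1)     # the same consumer can re-read from any position

queue = WorkQueue()
queue.send("job-1")
queue.send("job-2")
first = queue.receive_and_ack()   # "job-1" is now gone from the queue
second = queue.receive_and_ack()
third = queue.receive_and_ack()   # nothing left to hand out
```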

+29

Matthew Curry Oct 31 '14 at 19:10

Excerpt from the AWS documentation:

We recommend Amazon Kinesis Streams for use cases with requirements that are similar to the following:
Routing related records to the same record processor (as in streaming MapReduce). For example, counting and aggregation are simpler when all records for a given key are routed to the same record processor.
Ordering of records. For example, you want to transfer log data from the application host to the processing/archival host while maintaining the order of log statements.
Ability for multiple applications to consume the same stream concurrently. For example, you have one application that updates a real-time dashboard and another that archives data to Amazon Redshift. You want both applications to consume data from the same stream concurrently and independently.
Ability to consume records in the same order a few hours later. For example, you have a billing application and an audit application that runs a few hours behind the billing application. Because Amazon Kinesis Streams stores data for up to 7 days, you can run the audit application up to 7 days behind the billing application.

We recommend Amazon SQS for use cases with requirements that are similar to the following:
Messaging semantics (such as message-level ack/fail) and visibility timeout. For example, you have a queue of work items and want to track the successful completion of each item independently. Amazon SQS tracks the ack/fail, so the application does not have to maintain a persistent checkpoint/cursor. Amazon SQS will delete acked messages and redeliver failed messages after a configured visibility timeout.
Individual message delay. For example, you have a job queue and need to schedule individual jobs with a delay. With Amazon SQS, you can configure individual messages to have a delay of up to 15 minutes.
Dynamically increasing concurrency/throughput at read time. For example, you have a work queue and want to add more readers until the backlog is cleared. With Amazon Kinesis Streams, you can scale up to a sufficient number of shards (note, however, that you need to provision enough shards ahead of time).
Leveraging Amazon SQS's ability to scale transparently. For example, you buffer requests and the load changes as a result of occasional load spikes or the natural growth of your business. Because each buffered request can be processed independently, Amazon SQS can scale transparently to handle the load without any provisioning instructions from you.

+27

cloudtechnician Jul 25 '16 at 11:00

~~Another thing: Kinesis can trigger a Lambda, but SQS cannot.~~ So with SQS, you must either provision an EC2 instance to process SQS messages (and deal with it when it fails), or you need a scheduled Lambda (which does not scale up or down; you get only one invocation per minute).

Edit: this answer is no longer correct. SQS can trigger Lambda directly as of June 2018.

https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html

+15

DenNukem Apr 22 '16 at 2:52

Pricing models are different, so one or the other may be cheaper depending on your use case. Using the simplest case (not including SNS):

SQS charges per message (each 64 KB counts as one request).
Kinesis charges per shard per hour (1 shard can handle up to 1,000 messages or 1 MB per second), and also for the amount of data you put in (per 25 KB).

Plugging in current prices and ignoring the free tier, if you send 1 GB of messages per day at the maximum message size, Kinesis will cost much more than SQS ($10.82 per month for Kinesis versus $0.20 per month for SQS). But if you send 1 TB per day, Kinesis is somewhat cheaper ($158 per month versus $201 per month for SQS).

Details: SQS charges $0.40 per million requests (64 KB each), so $0.00655 per GB. At 1 GB per day, this is just under $0.20 per month; at 1 TB per day, it is a little over $201 per month.

Kinesis charges $0.014 per million requests (25 KB each), so $0.00059 per GB. At 1 GB per day, this is less than $0.02 per month; at 1 TB per day, it is about $18 per month. However, Kinesis also charges $0.015 per shard-hour. You need at least 1 shard per 1 MB per second. At 1 GB per day, 1 shard is plenty, so that adds another $0.36 per day, for a total cost of $10.82 per month. At 1 TB per day, you need at least 13 shards, which adds another $4.68 per day, for a total cost of $158 per month.
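The arithmetic above can be reproduced with a few lines of Python. The prices are the ones quoted in this answer (September 2017) and may well have changed since; the 30-day month, binary units (1 GB = 1024 × 1024 KB) and messages always filling the full 64 KB or 25 KB billing unit are assumptions chosen to match the figures given. The function names are made up.

```python
import math

SQS_PER_MILLION_REQUESTS = 0.40       # USD, one request per 64 KB
KINESIS_PER_MILLION_PUT_UNITS = 0.014  # USD, one unit per 25 KB
KINESIS_PER_SHARD_HOUR = 0.015         # USD
KB_PER_GB = 1024 * 1024
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_MONTH = 30                    # assumption used in the answer's totals


def sqs_monthly_cost(gb_per_day):
    requests_per_day = gb_per_day * KB_PER_GB / 64
    return requests_per_day * SQS_PER_MILLION_REQUESTS / 1e6 * DAYS_PER_MONTH


def kinesis_monthly_cost(gb_per_day):
    put_units_per_day = gb_per_day * KB_PER_GB / 25
    put_cost = put_units_per_day * KINESIS_PER_MILLION_PUT_UNITS / 1e6 * DAYS_PER_MONTH
    # One shard per 1 MB/s of sustained throughput, and never fewer than one.
    mb_per_second = gb_per_day * 1024 / SECONDS_PER_DAY
    shards = max(1, math.ceil(mb_per_second))
    shard_cost = shards * KINESIS_PER_SHARD_HOUR * 24 * DAYS_PER_MONTH
    return put_cost + shard_cost


for gb_per_day in (1, 1024):
    print(gb_per_day, "GB/day:",
          f"SQS ${sqs_monthly_cost(gb_per_day):.2f},",
          f"Kinesis ${kinesis_monthly_cost(gb_per_day):.2f}")
```

The shard charge is a fixed floor, which is why Kinesis loses badly at low volume and only pulls ahead once the per-request SQS charge dominates.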

+9

John Velonis Sep 18 '17 at 20:52

Kinesis solves the map part of the problem in a typical map-reduce scenario for streaming data, whereas SQS does not. If you have streaming data that needs to be aggregated by key, Kinesis ensures that all data for a given key goes to a specific shard, and that shard can be consumed on a single host, which makes aggregation by key simpler than with SQS.

+7

bhanu tadepalli Nov 19 '15 at 7:21

I will add one more thing that no one has mentioned yet - SQS is several orders of magnitude more expensive.

+3

Eugene Feingold Dec 03 '15 at 19:43

EJ Brennan · Accepted Answer · 2014-10-29 09:34

On the surface they are vaguely similar, but your use case will determine which tool is appropriate. IMO, if you can get by with SQS, then you should: if it does what you want, it will be simpler and cheaper. Here is a better explanation from the AWS FAQ, which gives examples of appropriate use cases for both tools to help you decide:

FAQ
