What is the difference between Scan and Query in DynamoDB? When should I use Scan and when Query?

The DynamoDB documentation describes the Query operation as follows:

A Query operation searches only primary key attribute values and supports a subset of comparison operators on key attribute values to refine the search process.

and the Scan operation:

A Scan operation scans the entire table. You can specify filters to apply to the results to refine the values returned to you after the complete scan.

Which one is better in terms of performance and cost?

+36
amazon-dynamodb
4 answers

Suppose your DynamoDB table has customer_country as its partition key (primary key). To run a Query you must supply a customer_country value. The operation then reads only the items stored under that value, and any filter is applied only to those items.

If you Scan the table, the filter is evaluated against the items of every partition key value. A Scan first reads all the data and applies the filter only after the items have been read from the table.

For example:

Here customer_country is the partition key (primary key) and id is the sort key.

 -----------------------------------
 customer_country | name  | id
 -----------------------------------
 VV               | Tom   | 1
 VV               | Jack  | 2
 VV               | Mary  | 4
 BB               | Nancy | 5
 BB               | Lom   | 6
 BB               | XX    | 7
 CC               | YY    | 8
 CC               | ZZ    | 9
 -----------------------------------

  • If you perform a Query, you must specify a customer_country value, and the condition on the partition key can only use the equality operator (=).

  • Thus only the items whose partition key equals that value are read.

  • If you perform a Scan, it reads every item in the table and filters the data only after it has read it.

Note: avoid Scan where you can. Because it reads the whole table before filtering, it can consume a large share of your read capacity units (RCUs).
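The difference in how much data each operation touches can be sketched with a small in-memory model of the example table above. This is only an illustration in plain Python, not the DynamoDB API: the names query, scan, and PARTITIONS are made up for the sketch, and "items read" is a rough stand-in for consumed read capacity.

```python
from collections import defaultdict

# In-memory copy of the example table; a stand-in, not the DynamoDB API.
ITEMS = [
    {"customer_country": "VV", "name": "Tom", "id": 1},
    {"customer_country": "VV", "name": "Jack", "id": 2},
    {"customer_country": "VV", "name": "Mary", "id": 4},
    {"customer_country": "BB", "name": "Nancy", "id": 5},
    {"customer_country": "BB", "name": "Lom", "id": 6},
    {"customer_country": "BB", "name": "XX", "id": 7},
    {"customer_country": "CC", "name": "YY", "id": 8},
    {"customer_country": "CC", "name": "ZZ", "id": 9},
]

# Group items by partition key so a lookup can skip unrelated partitions.
PARTITIONS = defaultdict(list)
for item in ITEMS:
    PARTITIONS[item["customer_country"]].append(item)


def query(partition_value):
    """Hypothetical Query: read only the items under one partition key value."""
    items_read = PARTITIONS.get(partition_value, [])
    return list(items_read), len(items_read)


def scan(predicate):
    """Hypothetical Scan: read every item first, then apply the filter."""
    items_read = [item for items in PARTITIONS.values() for item in items]
    matches = [item for item in items_read if predicate(item)]
    return matches, len(items_read)


query_result, query_reads = query("VV")
scan_result, scan_reads = scan(lambda item: item["customer_country"] == "VV")

# Same three items come back either way, but the scan had to read all eight.
print(query_reads, scan_reads)  # prints: 3 8
```

Both calls return Tom, Jack, and Mary; only the number of items read differs.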

+25

When creating the DynamoDB table, choose the primary key and local secondary indexes (LSIs) so that a Query operation returns the items you need.

A Query supports only an equality condition on the partition key, but conditional operators (=, <, <=, >, >=, BETWEEN, begins_with) on the sort key.

Scan operations are usually slower and more expensive, since the operation has to read every item in your table in order to find the requested items.
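As a rough sketch of that rule, the following plain-Python model applies an equality lookup on the partition key and one optional operator on the sort key. It does not call DynamoDB: the query function, the SORT_KEY_OPS table, and the sample data are assumptions made for illustration.

```python
import operator

# Operators allowed on the sort key in this sketch, mirroring the list above.
SORT_KEY_OPS = {
    "=": operator.eq,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "between": lambda value, low, high: low <= value <= high,  # inclusive
    "begins_with": lambda value, prefix: value.startswith(prefix),
}


def query(partitions, partition_value, sort_op=None, *operands):
    """Equality on the partition key, optional condition on the sort key."""
    items = partitions.get(partition_value, [])  # partition key: equality only
    if sort_op is None:
        return list(items)
    test = SORT_KEY_OPS[sort_op]
    return [item for item in items if test(item["sort"], *operands)]


# Hypothetical data: partition key value -> items, each with a string sort key.
partitions = {
    "c1": [{"sort": "basic"}, {"sort": "premium"}, {"sort": "pro"}],
    "c2": [{"sort": "basic"}],
}

print(query(partitions, "c1", "begins_with", "pr"))
print(query(partitions, "c1", "between", "a", "c"))
```

Note that there is no way in this model to pass a range operator for the partition key, which is the restriction the sentence above describes.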

Example:

 Table: CustomerId, AccountType, Country, LastPurchase
 Primary Key: CustomerId + AccountType

In this example, you can use the Query operation to get:

  1. The items for a given CustomerId, with a conditional filter on AccountType

A Scan operation has to be used to return:

  1. All customers with a specific account type
  2. Items based on conditional filters on Country, e.g. all customers from the USA
  3. Items based on conditional filters on LastPurchase, e.g. all customers who made a purchase last month

To avoid Scan operations for frequently used access patterns, create a local secondary index (LSI) or a global secondary index (GSI).

Example:

 Table: CustomerId, AccountType, Country, LastPurchase
 Primary Key: CustomerId + AccountType
 GSI: AccountType + CustomerId
 LSI: CustomerId + LastPurchase

In this example, the Query operation lets you get:

  1. The items for a given CustomerId, with a conditional filter on AccountType
  2. [GSI] The items for a specific AccountType, with a conditional filter on CustomerId
  3. [LSI] The items for a given CustomerId, with a conditional filter on LastPurchase
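One way to picture why the indexes help is to treat each index as another copy of the items, grouped by a different partition key and ordered by a different sort key. The sketch below models that in plain Python with made-up sample data; build_index and query_index are hypothetical helpers, not DynamoDB calls.

```python
from collections import defaultdict

# Hypothetical sample items using the attribute names from the example.
ITEMS = [
    {"CustomerId": "c1", "AccountType": "basic", "Country": "USA", "LastPurchase": "2023-01-15"},
    {"CustomerId": "c1", "AccountType": "premium", "Country": "USA", "LastPurchase": "2023-03-02"},
    {"CustomerId": "c2", "AccountType": "premium", "Country": "DE", "LastPurchase": "2023-02-20"},
]


def build_index(items, partition_attr, sort_attr):
    """Group items by partition_attr and order each group by sort_attr."""
    groups = defaultdict(list)
    for item in items:
        groups[item[partition_attr]].append(item)
    for group in groups.values():
        group.sort(key=lambda item: item[sort_attr])
    return {"sort_attr": sort_attr, "partitions": dict(groups)}


def query_index(index, partition_value, sort_condition=None):
    """Equality on the index's partition key, optional test on its sort key."""
    items = index["partitions"].get(partition_value, [])
    if sort_condition is None:
        return list(items)
    return [item for item in items if sort_condition(item[index["sort_attr"]])]


table = build_index(ITEMS, "CustomerId", "AccountType")  # primary key
gsi = build_index(ITEMS, "AccountType", "CustomerId")    # GSI
lsi = build_index(ITEMS, "CustomerId", "LastPurchase")   # LSI

# [GSI] all customers with a specific account type, without reading other types
premium_ids = [item["CustomerId"] for item in query_index(gsi, "premium")]

# [LSI] purchases of one customer since a given date
recent = query_index(lsi, "c1", lambda date: date >= "2023-02-01")
print(premium_ids, [item["AccountType"] for item in recent])
```

Country has no index in this model, so a lookup by country would still have to walk every item, which corresponds to the Scan case.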
+18

In terms of performance, I think it is good practice to design your tables so that the application can use Query instead of Scan. Because a Scan always reads the entire table before it filters out the desired values, it takes more time and consumes more capacity than a Query that returns the same items. For more information, please refer to the white paper.

+5

Query is much better than Scan in terms of performance. Scan, as the name implies, scans the entire table. But you need to know the table's key, its sort key, its indexes, and their sort keys well in order to know when you can use Query. If you filter your request using:

  • the key
  • the key and the sort key
  • an index
  • an index and its associated sort key

use Query! Otherwise use Scan, which is more flexible about which attributes you can filter on.

You CANNOT use Query if you want to look items up by:

  • more than two fields (for example key, sort key, and an index)
  • a sort key only (of the primary key or of an index), without its key
  • regular fields only (not a key, index, or sort key)
  • a mix of an index and another index's sort key (index1 with the sort key of index2)
  • ...
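The rules of thumb above can be written down as a small check. This is a sketch in plain Python under simplifying assumptions: it looks only at which attributes you want to use as lookup conditions, it does not model additional filtering after the read, and the function name and the key schemas are hypothetical.

```python
def find_query_target(lookup_attrs, key_schemas):
    """Return the name of a table or index that can serve a Query, else None.

    lookup_attrs: set of attribute names you want to search by.
    key_schemas:  name -> (partition key, sort key or None).
    """
    wanted = set(lookup_attrs)
    for name, (partition_key, sort_key) in key_schemas.items():
        if wanted == {partition_key}:
            return name  # key only, or index only
        if sort_key is not None and wanted == {partition_key, sort_key}:
            return name  # key plus its own sort key
    return None  # no match: fall back to Scan


# Hypothetical key schemas, reusing attribute names from an earlier answer.
SCHEMAS = {
    "table": ("CustomerId", "AccountType"),
    "gsi": ("AccountType", "CustomerId"),
    "lsi": ("CustomerId", "LastPurchase"),
}

print(find_query_target({"CustomerId", "LastPurchase"}, SCHEMAS))  # lsi
print(find_query_target({"LastPurchase"}, SCHEMAS))                # None
```

A result of None corresponds to the cases in the list: a sort key on its own, regular fields, more than two fields, or a key paired with another index's sort key.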

A good explanation: https://medium.com/@amos.shahar/dynamodb-query-vs-scan-sql-syntax-and-join-tables-part-1-371288a7cb8f

0
