Custom ABAP Performance

Are there general ABAP-specific tips related to running large SELECT queries?

In particular, is it possible to close the question about FOR ALL ENTRIES IN vs JOIN once and for all?

+4
source share
8 answers

A few (more or less) ABAP-specific hints:

Avoid SELECT * where it is not needed , try to select only those fields that are needed. Reason: each value may be displayed several times during the process (DB Disk → DB Memory → Network → DB Driver → ABAP internal). It's easy to save processor cycles if you still don't need fields. Be very careful if you CHOOSE * a table containing BLOB fields, such as STRING, this can completely kill your database performance, because the contents of the blob are usually stored on different pages.

Do not select SELECT ... ENDSELECT for small to medium result sets , use SELECT ... INTO TABLE instead. Reason: SELECT ... INTO TABLE performs a single selection and does not hold the cursor open, while SELECT ... ENDSELECT usually displays one row for each iteration of the loop.

It was a kind of urban myth - there is no performance penalty when using SELECT as a loop statement. However, this will keep the cursor open during the loop, which can lead to undesirable (but not strictly performance-related) effects.

For large result sets, use the cursor and the internal table. Reason: Same as above and you will avoid too much heap space.

Do not use ORDER BY, use SORT instead. Reason: better scalability of the application server.

Be careful with nested SELECT statements. Although they can be very convenient for small "internal result sets", they represent tremendous performance if a subquery returns a large result set.

Measure, measure, measure Never take anything if you are concerned about performance. Create a representative test data set and run tests for different implementations. Learn how to use ST05 and SAT.

It is impossible to close the second question “once and for all”. First of all, FOR ALL ENTRIES IN combines a database table and an internal (memory) table, while JOIN only works with database tables. Since the database does not know anything about ABAP internal memory, the FOR ALL ENTRIES IN statement will be converted to a set of WHERE statements - try using ST05 to track this. Secondly, you cannot add values ​​from the second table when using FOR ALL ENTRIES IN. Third, keep in mind that FOR ALL ENTRIES IN always implies DISTINCT. There are a few pitfalls - be sure to check out the ABAP online link, all of which are listed there.

If the number of records in the second table is small, both operators should be more or less equal in performance - the database optimizer should simply pre-select all the values ​​from the second table and use the smart join algorithm to filter through the first table. My recommendation: use whatever is good Do not try to set your code to illegibility.

If the number of records in the second table exceeds a certain value, Bad Things [TM] occurs with FOR ALL ENTRIES IN - the contents of the table are divided into several sets, then the query is converted (see above) and restarted for each set.

+13
source

Another note: the “Avoid SELECT *” statement is generally true, but I can tell you where it is false.
If you intend to use most of the fields anyway and where you have several requests (in the same program or different programs that can run at about the same time) that occupy most of the fields, especially if they are different fields that are missing.

This is because application server data buffers are based on a request selection signature. If you make sure that you are using the same query, you can make sure that the buffer can be used instead of reusing the database. In this case, SELECT * is better than selecting 90% of the fields, because you make buffer use much more likely.

Also note that with the last version I tested, the ABAP DB level was not smart enough to recognize SELECT A, B as the same as SELECT B, A, which means you should always put fields that you take in the same order (preferred table order) to make sure again that the data buffer in the application is well used.

+5
source

I usually follow the rules outlined in this pdf from SAP: "Efficient Database Programming with ABAP" It shows lots of tips for query optimization.

+3
source

This question will never be fully answered.

The ABAP operator for accessing the database is interpreted several times by various components of the entire system (SAP and DB). The behavior of each component depends on the component itself, its version and settings. The bulk of the interpretation is done in the DB adapter by SAP.

The only viable approach to achieve maximum performance is a measurement for a specific system (SAP version and vendor and database version).

+2
source

Transaction SE30 has fairly extensive tips and tricks. It even allows you (depending on authorization) to write your own code fragments and measure it.

Unfortunately, we cannot close the discussion “for all records” and join the discussion, since it very much depends on how your landscape is configured, which database server you use, the effectiveness of your table indexes, etc.

A simplified answer allows the database server to do as much as possible. For the question “for all entries” and “attach”, this means merging. In addition, every experienced ABAP programmer knows that he is never so simple. You have to try different scenarios and measure, as vwegert said. Also, do not forget to measure your living system, as sometimes the hardware configuration or data set differs significantly from completely different results in your living system than the test.

+1
source

I usually follow the following conventions:

  • Never do select * , select only the fields you need.
  • Never use "in the appropriate table" instead of creating local structures that have all the necessary fields.
  • In the where clause, try to use as many primary keys as possible.
  • If you have selected to retrieve a single record and all primary keys are included in the where clause, use Select single , or use SELECT UP TO TO 1 ROWS, ENDSELECT .
  • Try using Join statements to FOR ALL ENTRIES tables instead of using FOR ALL ENTRIES .
  • If all the records cannot be avoided, make sure the internal table is not empty and delete duplicate records for better performance.
0
source

Two more questions in addition to the other answers:

  • usually you use a JOIN for two or more tables in a database, and you use FOR ALL ENTRIES IN to join the database tables with a table in memory. If you can, JOIN .

  • usually an IN statement is more convenient than FOR ALL ENTRIES IN . But the kernel translates IN into a long select statement. The length of such a statement is limited, and you get a dump when it gets too long. In this case, you are forced to use FOR ALL ENTRIES IN , despite the performance implications.

0
source

With in-memory database technologies, it is best to complete all data and calculations on the database side using JOIN and database aggregation functions such as SUM.

But if you cannot, at least try to avoid accessing the database in loops. Also, of course, avoid reading the database without using indexes.

0
source

All Articles