Pros and cons of logic in SQL?

In the new assignment, I just got acquainted with the concept of introducing logic into SQL statements.

In MySQL, a dumb example would be:

SELECT P.LastName,
       IF(P.LastName = 'Baldwin', 'Michael', 'Bruce') AS FirstName
FROM University.PhilosophyProfessors P
-- This works like a ternary operator: if the condition is true, it returns
-- the first value; otherwise the second. So if a professor's last name is
-- 'Baldwin', their first name comes back as "Michael"; otherwise, "Bruce".**

For a more realistic example, suppose you need to decide whether a salesperson qualifies for a bonus. You could pull in various sales figures, do some calculations in your SQL query, and return true/false as the value of a column called "qualify".
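A minimal sketch of that idea, assuming a hypothetical `sales` table and a made-up threshold; SQLite is used through Python here so the sketch is runnable, but the query itself is plain SQL and would look nearly the same in MySQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (seller TEXT, amount REAL);
    INSERT INTO sales VALUES
        ('alice', 6000), ('alice', 7000),
        ('bob',   2000), ('bob',   1500);
""")

# The qualification logic lives inside the query itself:
# total sales over 10000 -> qualify = 1, otherwise 0.
rows = conn.execute("""
    SELECT seller,
           SUM(amount) AS total,
           CASE WHEN SUM(amount) > 10000 THEN 1 ELSE 0 END AS qualify
    FROM sales
    GROUP BY seller
    ORDER BY seller
""").fetchall()

for seller, total, qualify in rows:
    print(seller, total, qualify)
```

The application then only sees a ready-made boolean per seller instead of re-deriving it from raw numbers.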

Previously, I would have fetched all the sales data with the query and then performed the calculation in my application code.

That seems better to me because, when necessary, I can step through the application logic with a debugger, whereas whatever the database does is a black box to me. But I'm a junior developer, so I don't know what is customary.

What are the pros and cons of having a database server do some of your calculations / logic?

** Sample code based on a Monty Python sketch.

+7
sql
12 answers

This way, SQL becomes part of your domain model. It is one more (and not necessarily obvious) place where domain knowledge is implemented. Such leaks lead to tighter coupling between the business logic / application code and the database, which is usually a bad idea.

The only exceptions are views, reporting queries, and the like. But they are usually so isolated that it is obvious what role they play.

+7

One of the most compelling reasons for pushing logic into the database is to minimize network traffic. In the given example the gain is small, since you get the same amount of data back whether the logic is in the query or in your application.

If you only want the users whose first name is Michael, then it makes sense to implement the logic on the server. Admittedly, in this simple example it doesn't matter much, since you could just query for users whose last name is Baldwin. But consider a more interesting problem: you give each user a "popularity" rating based on how common their first and last names are, and you want to fetch the 10 most "popular" users. Calculating popularity in the application would mean fetching every single user before ranking, sorting, and selecting them locally. Calculating it on the server means you fetch only 10 rows over the wire.
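A sketch of that popularity ranking, with a hypothetical `users` table and a toy popularity measure (how many users share your first name plus how many share your last name); SQLite via Python so it runs anywhere, with `LIMIT 2` standing in for `LIMIT 10`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (first TEXT, last TEXT);
    INSERT INTO users VALUES
        ('Michael', 'Baldwin'), ('Michael', 'Palin'),
        ('Bruce',   'Baldwin'), ('Eric',    'Idle');
""")

# Ranking, sorting, and limiting all happen on the server, so only
# the top rows ever cross the wire.
top = conn.execute("""
    SELECT u.first, u.last,
           (SELECT COUNT(*) FROM users f WHERE f.first = u.first)
         + (SELECT COUNT(*) FROM users l WHERE l.last  = u.last) AS popularity
    FROM users u
    ORDER BY popularity DESC, u.first, u.last
    LIMIT 2
""").fetchall()
print(top)
```

Doing the same thing client-side would require transferring the entire table first.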

+7

There are not many absolutes in this debate, so the answer is "it depends". Some scenarios where different conditions affect this decision:

Client-server application

One example of a place where this might be advisable is an older 4GL or rich client-server application in which all database operations were performed through stored procedures: update, insert, and delete sprocs. In this case the essence of the architecture was that the sprocs acted as the main interface to the database, and all the business logic relating to particular entities lived in one place.

This type of architecture is somewhat unfashionable these days, but at one point it was considered the best way to do it. Many VB, Oracle Forms, Informix 4GL, and other client-server applications of that era were built this way, and it actually works quite well.

This approach is not without its drawbacks, however: SQL is not particularly good at abstraction, so it is quite easy to end up with fairly convoluted SQL code that presents a maintenance problem, because it is hard to understand and not as modular as one would like.

How relevant is this today? Quite often a rich client is still the appropriate platform for an application, and there is certainly plenty of new development being done with Winforms and Swing. Today we have good open-source ORMs, whereas a 1995-vintage Oracle Forms application may not have had the option of using that type of technology. However, the decision to use an ORM is certainly not black and white; Fowler's Patterns of Enterprise Application Architecture does a good job of working through a range of data access strategies and discussing their relative merits.

Three-tier application with a rich object model

This type of application takes the opposite approach and puts all of the business logic in a middle-tier object model, with a relatively thin database layer (or possibly an off-the-shelf mechanism such as an ORM). In this case you try to put all of the application logic in the middle tier. The data access layer has relatively little intelligence, except perhaps for a few stored procedures needed to work around ORM limitations.

In this case, SQL-based business logic is minimized, because the canonical home for the application logic is the middle tier.

High-volume batch processing

If you need a periodic run that selects records meeting some complex criteria and does something with them, it may be appropriate to implement this as a stored procedure. For anything that might have to trawl through a large portion of a decent-sized database, a sproc-based approach is probably the only reasonably efficient way to do this sort of thing.

In this case SQL may well be an appropriate way to do it, although traditional 3GLs (COBOL in particular) were designed specifically for this type of processing. At really large volumes (particularly on mainframes), doing this type of processing with flat or VSAM files outside the database can be the fastest way to do it. In addition, some jobs may be inherently record-oriented and procedural, and may be much more transparent and maintainable if implemented that way.

To paraphrase Ed Post, "you can write COBOL in any language", although you may not want to. If you need to keep the data in a database, use SQL, but it is certainly not the only game in town.

Reports

The nature of reporting tools tends to dictate the means of encoding business logic. Most are designed to work with SQL-based data sources, so the nature of the tool makes the choice for you.

Other domains

Some applications, such as ETL processing, can be a good fit for SQL. ETL tools start to become unwieldy when the transformations get too complex, so you may want to go with a stored-procedure-based architecture instead. Mixing queries and transformations across extraction, ETL-tool, and stored-procedure processing can lead to a transformation process that is hard to test and troubleshoot.

If you have a significant portion of your logic in sprocs, it may be better to put all of the logic there, since that gives you a relatively homogeneous and modular code base. In fact, I have it on fairly good authority that roughly half of all data warehouse projects in the banking and insurance sectors are built this way as an explicit design decision, for precisely this reason.

+3

Much of the time, the answer will depend on your deployment approach. Where it makes the most sense to place your logic depends on what you will need to touch when making changes.

In the case of non-compiled web applications, it can be easier to handle changes in a page or file than in queries (depending on query complexity, the team's programming background, and so on). In these situations, putting the logic in the scripting code is generally fine and makes later revision simpler.

In the case of desktop applications, which take more effort to change, placing such logic in the database, where it can be adjusted without recompiling the application, can work in your favor. If it is decided that people who used to receive bonuses at 20k should now receive them at 25k, it is much easier to adjust that on the SQL Server side than to recompile and redistribute the accounting application to all your users.
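One lightweight way to get that flexibility is to keep the threshold itself in the database, so changing it is a one-row data change rather than a redeploy. A sketch with hypothetical `config` and `totals` tables (SQLite via Python standing in for SQL Server):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE config (name TEXT PRIMARY KEY, value REAL);
    CREATE TABLE totals (seller TEXT, total REAL);
    INSERT INTO config VALUES ('bonus_threshold', 20000);
    INSERT INTO totals VALUES ('alice', 22000), ('bob', 21000);
""")

query = """
    SELECT seller,
           total >= (SELECT value FROM config
                     WHERE name = 'bonus_threshold') AS qualify
    FROM totals ORDER BY seller
"""
print(conn.execute(query).fetchall())   # both sellers qualify at 20k

# Raising the bar to 25k is a data change, not a recompile:
conn.execute("UPDATE config SET value = 25000 WHERE name = 'bonus_threshold'")
result = conn.execute(query).fetchall()
print(result)                           # now neither seller qualifies
```

Every client that runs the query picks up the new threshold immediately.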

+2

I am a strong supporter of putting as much logic as possible into the database, which means building it into views and stored procedures. I believe this best follows the DRY principle.

For example, consider a table with FirstName and LastName columns and an application that often uses the FullName field. You have three options:

  • Query FirstName and LastName and compute the full name in the application code.

  • Include FirstName, LastName, and the concatenation (FirstName || ' ' || LastName) in your SQL every time you query the table.

  • Define a CustomerExt view that includes the FirstName and LastName columns plus a computed FullName column, and then query that view instead of the customer table.

I believe option 3 is clearly correct. Consider later adding a MiddleInitial field to the table and including it in the full name. With option 3, you simply replace the view, and every application in your company instantly uses the new FullName format. The view still exposes the underlying columns for the cases where you need special formatting, while the standard case just works "automatically".
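A minimal sketch of option 3, with hypothetical table, view, and column names (SQLite via Python so it is runnable; SQLite has no CREATE OR REPLACE VIEW, hence the DROP):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (FirstName TEXT, LastName TEXT);
    INSERT INTO Customer VALUES ('Michael', 'Baldwin');

    -- One definition of FullName, shared by every application.
    CREATE VIEW CustomerExt AS
    SELECT FirstName,
           LastName,
           FirstName || ' ' || LastName AS FullName
    FROM Customer;
""")
print(conn.execute("SELECT FullName FROM CustomerExt").fetchall())

# Adding a MiddleInitial later only means replacing the view;
# no application code changes.
conn.executescript("""
    ALTER TABLE Customer ADD COLUMN MiddleInitial TEXT DEFAULT 'E';
    DROP VIEW CustomerExt;
    CREATE VIEW CustomerExt AS
    SELECT FirstName, MiddleInitial, LastName,
           FirstName || ' ' || MiddleInitial || '. ' || LastName AS FullName
    FROM Customer;
""")
full = conn.execute("SELECT FullName FROM CustomerExt").fetchone()[0]
print(full)
```

Every client querying CustomerExt sees the new format the moment the view is replaced.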

This is a simple case, but the principle is the same in more complex situations. Put application- or enterprise-wide data logic directly into the database, and you will not have to worry about keeping the various applications in sync.

+2

That first example is a bad idea, though. Per-row functions do not scale as the table grows. In fact, the better way to do it is to index LastName and use something like:

 SELECT P.LastName, 'Michael' AS FirstName
 FROM University.PhilosophyProfessors P
 WHERE P.LastName = 'Baldwin'
 UNION ALL
 SELECT P.LastName, 'Bruce' AS FirstName
 FROM University.PhilosophyProfessors P
 WHERE P.LastName <> 'Baldwin'

In databases where data is read more often than it is written (which is most of them), calculations like this should be performed at write time, for example by using an insert/update trigger to populate a real FirstName field.
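A sketch of the write-time approach using a trigger (hypothetical schema; SQLite via Python, where MySQL would instead use a BEFORE INSERT trigger setting NEW.FirstName directly):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PhilosophyProfessors (LastName TEXT, FirstName TEXT);

    -- Populate FirstName once, when the row is written, instead of
    -- recomputing it on every read.
    CREATE TRIGGER set_first_name AFTER INSERT ON PhilosophyProfessors
    BEGIN
        UPDATE PhilosophyProfessors
        SET FirstName = CASE WHEN NEW.LastName = 'Baldwin'
                             THEN 'Michael' ELSE 'Bruce' END
        WHERE rowid = NEW.rowid;
    END;

    INSERT INTO PhilosophyProfessors (LastName) VALUES ('Baldwin'), ('Palin');
""")
rows = conn.execute(
    "SELECT LastName, FirstName FROM PhilosophyProfessors ORDER BY LastName"
).fetchall()
print(rows)
```

Reads then pay nothing: FirstName is already stored, and the indexable LastName column stays untouched.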

Databases should be used for storing and retrieving data, not for bulk computation unrelated to the data itself, which slows everything down for everyone.

+1

The answer depends on your experience with and knowledge of your technologies. Also, if you are a technical manager, it depends on your assessment of the skills of the people on your team and of those you intend to hire to support, extend, and maintain the application in the future. If you are not database-literate, stick to doing it in code. If, on the other hand, you are comfortable with database coding (as you should be), there is nothing wrong (and a lot right) with doing it in the database.

Two other considerations that may affect the decision are whether the logic is so complex that doing it in database code would be far more convoluted than doing it in application code, and whether the process involved requires data from outside the database (from some other source). In either of those scenarios, I would move the logic into a code module.

+1

Being able to step through the code in your IDE is about the only advantage of your post-processing solution. Running logic on the database server often reduces the size of the result sets, which means less network traffic. It also gives the query optimizer a much better picture of what you actually want done, which again often allows for better performance.

So I almost always recommend putting the logic in SQL. If you treat the database as just a dumb data store, it will return the favor by behaving dumb, and depending on the situation that can absolutely kill your performance, if not today, then maybe next year when things take off...

+1

One big pro: the query can be something other people can work with. Reports were already mentioned: many reporting tools, or report plugins for existing programs, allow users to create their own queries, the results of which are then displayed to them.

If you cannot change the code (because it is not yours), you can still change the query. And in some cases (data migration), you will be writing queries for the migration anyway.

0

I like to distinguish data rules from business rules, and to push the data rules as close to the data as possible, up to and including stored procedures. There is not always a hard and fast distinction between the two, but in your sales-bonus example the formula itself may be a business rule, while the work of collecting and aggregating the various figures used in the formula is a data rule.

Sometimes, though, it depends on the deployment model and change-control procedures. If the sales formula changes frequently and deploying business-tier code is cumbersome, then adjusting a single function or stored procedure in the database can be a great solution.

0

I'm not entirely sure about this, but I like to put complex write queries in SQL, because then I don't have to worry about locking.

For example, if a record may not be written while it is in a certain state, the state check can be done inside the query itself, so the check and the write happen atomically.
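A sketch of that pattern with a hypothetical `orders` table (SQLite via Python): a conditional UPDATE checks the state and changes the row in one atomic statement, so no application-level read-check-write sequence, and hence no explicit lock, is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO orders VALUES (1, 'pending'), (2, 'shipped');
""")

# State check and write in a single statement: a shipped order
# cannot be cancelled, and there is no race window in between.
cur = conn.execute(
    "UPDATE orders SET status = 'cancelled' "
    "WHERE id = ? AND status = 'pending'",
    (2,),
)
print(cur.rowcount)   # 0: order 2 was already shipped, nothing changed

cur = conn.execute(
    "UPDATE orders SET status = 'cancelled' "
    "WHERE id = ? AND status = 'pending'",
    (1,),
)
print(cur.rowcount)   # 1: order 1 was pending and is now cancelled
```

The rowcount tells the application whether the guarded write actually happened.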

0

I am a big fan of elegant SQL queries, but they are harder to test, especially in the cloud, because you need a live database to run them.

0
