Why are database functions ignored and instead reverted to the middle tier?

What are the main reasons ( besides "database independence" ) that today most IT projects ignore the many features existing in modern DBMSs, such as Oracle 11g and SQL Server 2008?

Or borrow from the Helsinki Declaration blog , which puts it like this:

Over the past twenty years, we have noticed that the functionality (functions) available to us within the DBMS is growing exponentially. These features allowed us to create database applications. This is what we all started to do in the fast-paced nineties.

But then, at the dawn of the new millennium, something happened. And something mysteriously made the DBMS role within the database application project diminutive to insignificant. (...) Starting from the new millennium, we bring all applied logic from the DBMS to mid-level servers. The functionality of the material implemented outside the DBMS exploded, and the functional-functional DBMS was hardly used for anything other than string storage.

We are talking about things like

  • Stored procedures used as data data APIs (to ensure security and prevent excessive network traffic)
  • Materialized views
  • Instead of triggers
  • Hierarchical queries (connect by)
  • Geography (spatial data types)
  • Analytics (lead, lag, rollup, cube, etc.)
  • Virtual Private Database (VPD)
  • Database Level Audit
  • Flashback Requests
  • XML Generation and XSL Transformation in Database
  • HTTP callouts from the database
  • Background Scheduler

Why aren't these features used? Why do most Java, .NET, and PHP developers take the "SELECT * FROM mytable" approach?

+83
database
Sep 02 '09 at 11:17
source share
24 answers

Because stored procedures:

  • add another development language, increasing complexity and potentially redundant code (the logic is written in both languages);
  • usually have worse control, monitoring and debugging capabilities than PHP, C #, Java, Python, etc .;
  • usually less capable than most mid-level languages;
  • only has an advantage when converting large amounts of data (where you avoid server-side fallback), which tends to be minimal actual use.

Therefore, this is the usual methodology for ASP.NET ASP.NET applications.

As Jeff Atwood said, stored procedures are database assembler language , and people are not inclined to be coded in assembly language if they don't need to.

I often used materialized views and sometimes used CONNECT BY in Oracle, none of which, I believe, exist in MySQL.

I am not inclined to use XML / XSLT in the database, because, this means that I use XML and XSLT.

With regard to geographic or spatial data structures, the reason seems to be that they are difficult to just “pick up”. This is a fairly specialized area. I have read the MySQL manual on spatial data structures, and I am sure it makes sense for someone with extensive experience with GIS, but for me and my limited needs (which usually mark the latitude / longitude of a point), it's just doesn It seems worth taking the time to figure it out.

Another problem is that if you go beyond ANSI SQL (a lot), then you have just tied yourself to a specific database provider, and possibly to a specific version. For this reason, you often find that application developers will treat their databases with the lowest common denominator, which means treating them like a dump of relational data.

+55
Sep 02 '09 at 11:26
source share

Because developers do not know about SQL. They rely on DDL and DML created by tools like Hibernate and language constructs like JPA annotations. Developers do not care if they are terribly inefficient because they are mercifully hidden by the usual log levels and because database administrators are not part of the development teams.

That's why I like iBATIS tools. They force you to write and understand SQL, including specific DBMS functions.

+36
Sep 02 '09 at 11:33
source share

I think one of the reasons is fear of the seller.

This is not often said, but the benefits of using vendor-specific features must be weighed against costs. Mostly the costs of rewriting parts that depend on vendor-specific functions for each database that you want to maintain. There is also cost of execution if you are implementing something in a general way when the supplier provides the best way.

I will give this example: it would be possible to find the "lockin" of SQL Server more acceptable if you understand that all Analysis Services, Reporting Services, etc. can do for your application. For large commercial database systems, only the "SQL database engine" needs to be considered.

+21
Sep 02 '09 at 11:31
source share

"Why are database functions ignored."

Since many so-called developers are completely unaware of data management, and even worse, they are completely unaware of their ignorance. "Unskilled and unaware of it," for which it rings.

+17
02 Sep '09 at 17:51
source share

If your software runs on your client hardware, any changes to the database (new stored procedures, updated views, etc.) will require DBA privileges. This is almost always a problem for customers. Engaging a database group complicates any updates that need to be performed. Many great reasons are presented here, but this is the only thing I need to not put the code in the database like the plague.

+12
Sep 02 '09 at 21:09
source share

I think one of the reasons is fear of the seller. These DBMS functions are not standardized - for example, stored procedures are specific to the database, and if you implemented material using stored procedures (instead of, say, web services opened through the middle tier), then you are forever obsessed with the selected DBMS (i.e. if you do not want to spend time / money on re-implementation in another DBMS, if you want to change the DBMS).

+10
Sep 02 '09 at 11:21
source share

MySQL

When web applications exploded in the late 1990s and early 2000s, MySQL was in version 3.3 or 4.0 and did not support anything higher than simple SELECT s. It was, however, free and installed with most Linux distributions. As a result, a generation of programmers did not know about the databases and did not know how to use them.

Even now, when MySQL is at 5.1 and supports most of the features of the commercial system, the same crude old blogs and articles are used as templates when starting a new LAMP project, and MySQL is deployed with MyISAM and 3.3 -era tables.

+8
Sep 02 '09 at 20:59
source share

SQL does not work for the same reason as, for example, Haskell. The metric that determines the success of a language is not purity, but the ease of interpretation by computers, but how difficult it is to maintain the programs written in it.

Ordinary mortals fail even with the simplest language. Maybe 1 out of 10 people have skills in using a simple language such as C #. But out of these 10%, only 1 in 10 or 1% of all people can effectively use languages ​​like SQL or Haskell.

Now SQL is incomplete as a language, in the sense that there are very few things that you can do only with SQL. You always need a different language. What role does this leave for SQL? Developers will understand the benefits of ACID over flat file storages, but besides, there really is nothing for databases to offer them.

The second problem is that SQL is effectively not very compatible with Source Versioning. SQL seems really built on the notion that you will fix everything for the first time. Thus, it is not only bad for developers, but also bad for the development process.

+7
Sep 02 '09 at 11:48
source share

It is easier to fix / reinstall the middle level than the DBMS.

It probably depends on your architecture, but it is our mind. Couple that with the fact that we have one database administrator who is busy and probably paid more than our developers. All developers know SQL, and some of them are semi-formal in procedural languages. If a really hairy performance problem arises, it will be easier and faster for developers to work at an average level than a database, regardless of whether the architecture is better in one form or another.

+7
02 Sep '09 at 14:21
source share

I came across quite a few people who simply did not know that such functions exist - they chopped their teeth in the early days of mySQL, and they never used anything else, and they did not continue to work with promoting new storage tables in mySQL. Or they studied the databases at school, and they never returned to see everything they missed.

They study the minimal SQL query and do not understand all the different extensions offered by various RDBMS.

In one project, I would like to have materialized views ... but I use Postgres. I would like to use spatial data types for another project, but I will have to hack or modify the databases to cope with mySQL support so that they are not null. I even had to figure out how to turn off Oracle transactional consistency to solve long-term queries in OLTP, which would not be a problem in mySQL.

I can normally code the database flaws for this problem, but part of the problem is choosing the right tool for the job - in the current project we spent man-months replicating the data because we use Postgres and they solved Slony-1 before they really knew everything that we would replicate.

... I view this question as “why more people do not use the function x in the language y” - if they are not experts in the language, they may not know that the function x exists.

(and do not take this as support for obtaining a DBA certificate ... I know some Oracle database administrators who could not program their exit from the wet bag, I took all the courses in 8i days, but refused to take the tests because I did not want to to be combined with this group)

+6
Sep 02 '09 at 12:20
source share

Scalability. The more work you give the database server, the more this becomes a bottleneck. It is more scalable to have a whole farm of load-balanced application servers that process data, and just use the database as a persistence store.

+5
Sep 02 '09 at 12:56
source share

I think the biggest reason that eclipses everything else is that relational database systems become significantly more important when multiple applications use the same data. A well-known article by Codd is called "The Relational Data Model for Large General Databanks" (emphasis added)

People tend to think that the application they are writing now will always be controlled by their team; and that it will always satisfy all the needs of people interested in the data generated by the application. If a new need arises, this will be satisfied by adding a new function to an existing application, rather than creating a new application.

But in many cases (not everything, of course, every situation is different), this development model does not work very well in the long run. As the data created by the application accumulates and becomes more important for the business, different people will have interesting ideas on how to use the data. When this happens, if you do not have a relational database management system, you will have a big task.

+5
Sep 02 '09 at 17:23
source share

I had too many situations when corporate policy ("we allowed access to SQL Server, so let's install less powerful DBMSs such as Access to process millions of rows and join millions of rows in another table to automate this import ...") or even a technical policy that might happen ("I know that Access can process this amount of data, and even if we cannot split the MDB into several MDBs and reference them .....")

Ugh. Corporate policy, technical policy or even ignorance prevented me from using many functions.

Another example: I see no reason not to use stored procedures in a 100% Microsoft store, where SQL Server is DBMS of choice. But since the IT guy who was ultimately going to own the solution was "easy" on the SP, I had to resort to other measures. I mean, there is a great example of why some of these "features" were ignored by them in their store.

I know another store that still uses DOS Foxpro 2, because their only IT guy wrote the existing system in this way, and that’s how everything new will evolve. What for? Can we move with time? Many marketing people open several DOS prompts at once, and Foxpro "works" in them to create the ugliest reports I have ever seen. But it works - I will give it to them. It works - they have 12 million rows in their main table and 50 other tables that they “join” to this main table (not all 50 are immediately obvious), but man ... it went well in 1991! They don’t even want to discuss one item from the list of bullets that you indicated in your question.

So that’s why I think.

+5
Sep 03 '09 at 1:00
source share

I would say that the biggest reason is that most people do not know about them. Once someone has figured out a solution to the problem, this will become the default solution for similar ones. SELECT * FROM table has been working for many people for a long time, so they don’t worry about new approaches to old problems.

Another reason is that sometimes writing it in code is much easier than using a database. This is the same idea as folding your own or buying a component from a shelf. Using a pre-written function can solve your problems many times, but from time to time you need to do something that goes beyond the capabilities of what pre-written components can do.

+4
02 Sep '09 at 11:26
source share

Good question and good discussion.

Another way to say: "Why didn’t the object databases get in?" which is the other side of the coin. Databases continue to be an annoying abstraction that still runs in every application, but they are incompatible with the OO logic of modern applications.

In fact, a strange state of things we hide and duplicate the functionality of the database in ActiveRecord, Hibernate and other environments. But this is what happens with the paradigms at the break point (“mismatch of object-relational impedance”). Will we move on to database technologies that are similar to our OO applications (such as object databases)?

The answer is “not for long,” and at the same time expect the DB to be ignored and compressed and used to store only rows in many cases, as the middle tier expands in functionality to bridge the gap.

Another question: "Why should I do this in the database if the middle level can do it?" Intermediate level is familiar and constantly gaining momentum and functionality. Again, we use the middle tier to avoid OO-RDMS mismatch.

+4
Sep 02 '09 at 11:44
source share

To go to Christian , on scalability.

It's just that RDBMs are used more as pure data warehouses, and logic is used for application servers. The optional AS layer gives developers more flexibility than using RDBMS as an application server.

Previously, in the classic days, Fat Apps and Client Server, DB and Application Server were basically the same. You had the application logic embedded in your fat client code, or you returned it to the DBMS. But at that time, the main form of communication was SQL directly in the database.

Other application protocols (CORBA, DCOM, Remote EJB, and the increasingly common XML / JSON / HTTP-RPC style protocols over HTTP) are currently more common. Most databases do not respond directly to these protocols, so the application tier is used to intercept these calls and this tier invokes the database.

But, as we learned, now we get much more flexibility by introducing logic into this layer. A wide selection of tools, great caching flexibility or fault tolerance or even database technology (RDMBS, OODBMS, Document are stored as CouchDB). This “new” 3rd level, despite the added complexity, adds more flexibility and power than the complexity that it introduces.

When your application level is a very thin veneer on top of stored procedures, it really is a question of why it is even there at all.

Using the database and all its functions is an acceptable application strategy even today. SQL Server, Oracle, etc. - These are terribly powerful software components.

Even then, though, the third level is extremely useful in adding flexibility to a modern system.

+4
Sep 02 '09 at 17:59
source share

For me, the reason is not only that my applications are database agnostics, but the database best teaches the core CRUD functions. Yes, databases are highly optimized and may be able to call the HTTP callout, but why do you need to do this? Web service / web application optimized for HTTP calls, not database. Just like an application, it is not intended to connect directly to a data file and receive data. It can be done? Yes, but why? This is not what your application EXCLUDES.

I personally believe that everything you mentioned outside of stored procedures belongs to the application. If you know that your architecture is X, take advantage of the features of X, manually load the database server if necessary, etc. If it can be X or Y (or Z), then your application should be agnostic, if you are not trying to create work security, ensuring that you have to reorganize the application :). I think a little laziness, combined with comfort, may have something to do with it. I know that I would rather do this in C # than SQL if I can. My C # skills are only better.

+3
Sep 02 '09 at 11:31
source share

First, any developer using ORM is naive if S / he believes that using ORM denies the presence of SQL skills. Most ORMs that generate SQL are different from the SQL that is emitted, depending on how the object queries are constructed. The developer will need to parse SQL to see if they need to modify any object queries.

Short answer: many of these functions are not practical for developing OO. I know that database administrators do not like to hear this, but it is true. These functions are good for edge cases, and most good ORMs like N / Hibernate allow you to provide SQL for these cases.

When it comes to most CRUD delegation:

The long answer: I believe that in the world of RDBMS there is an increasing fear of maturity and finds this place in the world. Truth: OOP is older than RDBMS. OOP just gets out of this, increasing pain and maturation. I think SQL as a language is very mature, but the idea that RDBMS should handle is just getting settled. RDBMS was the owner of the business logic for most web applications until Java and C # appeared. I think that we are now beginning to feel this correction.

Speaking of which, I don’t think that any ORM designer will tell you that the quality of the sql statements passed to the DBMS does not matter.

When it comes to non-CRUD I have no answer here. Most stores that I know still use the database for ETL / etc ...

+3
Sep 03 '09 at 2:13
source share

There are not enough developers who know all these functions at a level that will really change the normal "middle level" of a programmer when it comes to implementing the same logic in a database or middle level. Perhaps the only people who really have deep knowledge of these features are database administrators. And they focus on issues other than development. There are more “normal” developers than database administrators. Therefore, it would be very difficult and expensive to find the right people for your team.

Another point is that you, as a rule, only collect in-depth knowledge about one database system, and not all of them. Thus, you may have SQL Server experts or Oracle experts, but not both. This leads (to some extent) to narrow areas of applications where high specialization is calculated. Then the market for such applications is not so big, even if it is there.

+2
Sep 02 '09 at 15:46
source share

I think the reason is the combination of blocking and lack of knowledge for most RDBM users. SQL is a programming language, and it’s much harder to learn the language in which you invoke SQL and SQL than to master one or the other, especially since SQL is a particularly unique language.

The solution, I think, is to abstract the functionality of your database into a utility class and grant ownership of the class to several users who know what they are doing with SQL. This minimizes the risk of blocking the provider (if you switch suppliers, the only thing that overwrites is the class). It also gives non-SQL developers an abstracted interface, so they don’t need to access the database directly.

+1
Sep 02 '09 at 20:46
source share

One of the problems that I observed with increasing the functionality of the database is scaling. It seems like a much more complicated proposal to scale the load on the database and the load on the web server.

Your options are limited, expandable with faster hardware (sometimes with much higher licensing costs) or complex, scalable read-only copies, etc.

If there are performance issues, I want them to be at the web server level. At least one of my options is to add another web server and load balance.

I do not argue with the database-level code in order to minimize the amount of network traffic (records) sent between the web server and the database server. I argue with other features, for example. Advanced processing of business logic at the database level.

0
Sep 03 '09 at 18:04
source share

Several posts point out that it is cheaper to scale at the application level than at the db level.

Another consideration is composite applications that access multiple data stores. It is much easier to write and maintain the language of the platform agnostic language at the application level than separate requests for a specific platform at the db level.

0
Sep 09 '09 at 19:08
source share

Since for writing object-oriented software in the host language, with its own objects in the host language, it is possible to write procedural software.

0
Sep 12 '09 at 15:15
source share

I have always worked on systems that are sold to many customers and work on client hardware. It leads to:

  • You do not know which version of the database software will be launched by the client.
  • Customers may not want to update the database software at your request due to licensing costs or IT policies.
  • , , "" , .
  • , RDBMS .
  • , , PL-SQL TSQL - !
  • ( , ..) ; , .
  • script , DLL-. ( MSI )
  • unit test , .
  • procs, #.
  • , , , . RDBMS , - -.

, , , .

0
01 . '10 16:50
source share



All Articles