Allowing Only Reasonable Queries

In our organization, we need to allow employees to filter data in our web application by supplying WHERE clauses. This has worked well for a long time, but we occasionally run into users who submit queries that require full table scans on large tables, inefficient joins, and so on.

Some clowns may write something like:

select *
from big_table
where Name in (select name from some_table where name like '%search everything%')
   or name in ('a', 'b', 'c')
   or price < 20
   or price > 40
   or exists (select 1 from some_other_table where col1 + col2 + col3 = 4)
   or exists (select 1 from table_a, table_b)

Obviously, this is not a great way to query these tables: it uses computed values, unindexed columns, a long chain of ORs, and an unrestricted join between table_a and table_b.

But to the user, this may make perfect sense.

So, what is the best way, if any, to let internal users submit a query to the database while ensuring that it will not lock dozens of tables and hang the web server for 5 minutes?

I assume there is a programmatic way in C#/SQL Server to get a query's execution plan before it runs. If so, which factors should go into the decision? Estimated I/O cost? Estimated CPU cost? What would be reasonable limits at which to tell the user that their query is not acceptable?

EDIT: We are a market research company. We have thousands of surveys, each of which has its own data. We have dozens of researchers who want to slice this data in arbitrary ways. We have tools that let them create “valid” filters using a graphical interface, but some “power users” want to supply their own queries. I understand that this is neither standard nor best practice, but how else can I let dozens of users query tables for the rows they want using arbitrarily complex and constantly changing conditions?

+4
11 answers

You can try using the following:

SET SHOWPLAN_ALL ON
GO
SET FMTONLY ON
GO
<<< Your SQL code here >>>
GO
SET FMTONLY OFF
GO
SET SHOWPLAN_ALL OFF
GO

Then you can parse what comes back. As for where to draw the line on the various measures, that will take some experience. There are some things to watch out for, but nothing cut and dried. Reading query plans is often more of an art than a science.
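As a rough illustration of parsing that output, the Python sketch below applies two checks to the rows returned while SHOWPLAN_ALL is on. It assumes the rows have already been fetched by some database driver and converted to dictionaries keyed by the SHOWPLAN_ALL column names (PhysicalOp, EstimateRows, TotalSubtreeCost). The function name `judge_plan` and both thresholds are hypothetical; the limits would have to be tuned against your own workload.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Mapping

# Hypothetical limits, not recommendations: tune them against real plans.
MAX_SUBTREE_COST = 50.0    # optimizer cost units, not seconds
MAX_SCAN_ROWS = 1_000_000  # largest scan we are willing to tolerate

# Physical operators that read a whole table or index.
SCAN_OPS = {"Table Scan", "Clustered Index Scan", "Index Scan"}


@dataclass
class PlanVerdict:
    accepted: bool
    reasons: List[str]


def _num(value: Any) -> float:
    """SHOWPLAN_ALL leaves some numeric columns NULL; treat those as 0."""
    return float(value) if value is not None else 0.0


def judge_plan(rows: Iterable[Mapping[str, Any]],
               max_cost: float = MAX_SUBTREE_COST,
               max_scan_rows: float = MAX_SCAN_ROWS) -> PlanVerdict:
    """Accept or reject an estimated plan given as SHOWPLAN_ALL rows."""
    rows = list(rows)
    reasons: List[str] = []

    # TotalSubtreeCost is cumulative, so the largest value in the plan
    # is the estimated cost of the whole statement.
    total = max((_num(r.get("TotalSubtreeCost")) for r in rows), default=0.0)
    if total > max_cost:
        reasons.append(
            f"estimated total subtree cost {total:.1f} exceeds {max_cost:.1f}")

    # Flag large scans individually so the user can see what to fix.
    for r in rows:
        est_rows = _num(r.get("EstimateRows"))
        if r.get("PhysicalOp") in SCAN_OPS and est_rows > max_scan_rows:
            reasons.append(
                f"{r['PhysicalOp']} over an estimated {est_rows:.0f} rows")

    return PlanVerdict(accepted=not reasons, reasons=reasons)
```

Returning the reasons rather than a bare yes/no lets the application tell the user why a filter was refused. These figures are optimizer estimates, so a gate like this can still let a slow query through or reject a harmless one.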

As others have noted, I think your problem goes deeper than the technical implications. The fact that you allow untrained people to access your database in this way is a major problem. From past experience, I often see this in companies that are too lazy or too inexperienced to capture their application requirements properly. I am not saying that this necessarily applies to your corporate environment, but that has been my experience.

+3

The premise of your question states:

In our organization, we need to allow employees to filter data in our web application by supplying WHERE clauses.

I find this premise flawed on its face. I cannot imagine a situation where I would allow users to do this. In addition to the problems you have already identified, you are opening yourself up to SQL injection attacks.

I would strongly recommend re-evaluating your requirements to make sure you cannot create a safer, more focused way to allow users to search.

However, if your users really are sophisticated (and trusted!) enough to supply WHERE clauses directly, they need to be trained on what they can and cannot submit as a filter.

+5

Besides trying to control what the user enters (which is a losing battle: there will always be a new hire who comes up with an imaginative query), I would look into Resource Governor; see SQL Server workload management using Resource Governor. You put the ad-hoc queries in a separate pool and cap the resources allocated to it. That way you mitigate the problem by limiting how much damage a bad query can do to other tasks.

You should also consider providing access to the data by other means, such as Power Pivot, and letting users massage the data as much as they like in their own Excel. Business users like this, and the impact on the transaction processing server is minimal.

+2

Instead of allowing employees to write (or append to) queries directly and then trying to estimate the cost of each query before running it, why not create some kind of advanced search or filter feature that does not rely on SQL you cannot control?
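One way such a filter feature could work is to accept structured conditions and generate the SQL yourself. The Python sketch below is a minimal illustration under assumptions: the function `build_where`, the column allowlist (borrowing `name` and `price` from the question's example), the operator list, and the limit on the number of terms are all hypothetical, and the `?` placeholders assume a driver that uses question-mark parameters.

```python
from typing import Any, List, Sequence, Tuple

# Hypothetical allowlists: only indexed columns and simple operators.
ALLOWED_COLUMNS = {"name", "price"}
ALLOWED_OPS = {"=", "<", ">", "<=", ">=", "LIKE", "IN"}

Filter = Tuple[str, str, Any]  # (column, operator, value)


def build_where(filters: Sequence[Filter],
                max_terms: int = 5) -> Tuple[str, List[Any]]:
    """Turn structured filters into a parameterized WHERE body.

    Returns the SQL fragment and its parameter list. Values never enter
    the SQL text, and column names and operators must be on an allowlist.
    """
    if len(filters) > max_terms:
        raise ValueError(f"at most {max_terms} conditions are allowed")

    clauses: List[str] = []
    params: List[Any] = []
    for column, op, value in filters:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"column not allowed: {column!r}")
        op = op.upper()
        if op not in ALLOWED_OPS:
            raise ValueError(f"operator not allowed: {op!r}")

        if op == "IN":
            values = list(value)
            if not values:
                raise ValueError("IN needs at least one value")
            marks = ", ".join("?" for _ in values)
            clauses.append(f"[{column}] IN ({marks})")
            params.extend(values)
        else:
            # A leading wildcard cannot use an index seek, so refuse it.
            if op == "LIKE" and str(value).startswith("%"):
                raise ValueError("LIKE patterns may not start with %")
            clauses.append(f"[{column}] {op} ?")
            params.append(value)

    # Conditions are ANDed; OR chains like the one in the question
    # are deliberately not expressible here.
    return " AND ".join(clauses), params
```

For example, `build_where([("price", "<", 20), ("name", "in", ["a", "b", "c"])])` yields the fragment `[price] < ? AND [name] IN (?, ?, ?)` with the parameters `[20, "a", "b", "c"]`. Expressions such as `col1 + col2 + col3 = 4` are rejected because they are not allowlisted column names.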

+1

In very large enterprises, this is common practice for internal applications. Often during the design phase you restrict the criteria or set reasonable limits on data ranges, but as soon as the business takes ownership of the application, calls from business-unit management get the restrictions removed. In my opinion, this is a management problem, not an engineering problem.

We profiled all the criteria and identified the worst offenders, both the users and the types of queries that caused the most problems, and restricted some of those queries. Also, some very expensive queries that were run regularly were built into the application, which cached the results and ran the queries during periods of low load. We also created optimized queries for standard users and gave only certain users the ability to search on anything. Just a few ideas.

+1

You can create a data model for your database and let users use the SQL Server Reporting Services Report Builder. It is GUI-based and does not require writing WHERE clauses, so there should be a limit to how much damage they can do.

Or you can keep a copy of the database for user queries, refresh it every hour or so, and let them go to town... :)

+1

I have worked in several places where this came up too. What we ended up doing was NOT allowing users unrestricted access, and promising that IT would do its best to provide queries when needed. The problem was that the database was quite complex, and even if users could write grammatically and syntactically correct SQL, they did not necessarily understand the relationships between the tables. In other words, even if they could write their own SQL, they would get wrong answers. We convinced users that the risk of making a wrong decision based on an erroneous or incomplete understanding of the 200 tables in the database was too high. It is better to get the right answer after a day or so than the wrong one instantly.

Another part of this is: what does IT do when user A writes a query and gets one answer, and then user B writes what he thinks is the same query and gets a different answer? Is it IT's job to find the differences? To fix both pieces of SQL? And so on. The bottom line is that I would not give them access. I would load the system up with predefined queries, as others have mentioned, and try to educate management on why this is the only approach that works in the long run.

+1

If you have that much data and want to give your customers the ability to analyze and view the information however they wish, I highly recommend looking into OLAP.

+1

Guess you've never heard of SQL Injection attacks? What if the user issues the DROP DATABASE command after the WHERE clause?

0

For this reason, direct SELECT permission is almost never granted to users in the vast majority of applications.

It would be much better to design your application around use cases so that you can cover a reasonable percentage of the requirements with specially designed filter / aggregation / layout options.

There are many ways to do this, so some analysis of your specific problem domain will definitely be required along with research on viable methods.

While direct SQL access is the most flexible option for your users, long-running queries are likely to be just the beginning of your headaches. SQL injection is a big concern, regardless of whether the source is malicious or simply mistaken.

0

(Chad mentioned this in a comment, but I think it deserves an answer.)

You may want to copy the data that needs ad-hoc querying into a separate database, in order to isolate any problems from the majority of users.

0
