In our organization, we need to allow employees to filter data in our web application by providing WHERE clauses. This has worked well for a long time, but we occasionally run into users who submit queries that require full table scans of large tables, inefficient joins, and so on.
Some clowns may write something like:
select * from big_table
where name in (select name from some_table where name like '%search everything%')
   or name in ('a', 'b', 'c')
   or price < 20
   or price > 40
   or exists (select 1 from some_other_table where col1 + col2 + col3 = 4)
   or exists (select 1 from table_a, table_b)
Obviously, this is a great way to query these tables: predicates on computed values and unindexed columns, a pile of ORs, and an unconstrained cross join between table_a and table_b.
But to the user who wrote it, it may seem perfectly reasonable.
So, what is the best way, if any, to let internal users submit a query to the database while ensuring it won't lock up dozens of tables and tie up the web server for five minutes?
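The only blunt guard I can think of is capping how long any user-submitted command may run, along the lines of the sketch below (the 10-second cap, the 5-second lock timeout, and the userQuery / connectionString names are placeholders of mine), but that only kills a query after it has already been chewing on the server:

using System;
using System.Data;
using System.Data.SqlClient;

static class QueryGuard
{
    // Crude safety net: cap how long a user-submitted query may run and how
    // long it may sit waiting on other sessions' locks. The numbers are placeholders.
    public static DataTable RunWithLimits(string connectionString, string userQuery)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Give up quickly if the query gets stuck behind someone else's locks (milliseconds).
            using (var guard = new SqlCommand("SET LOCK_TIMEOUT 5000;", conn))
                guard.ExecuteNonQuery();

            using (var cmd = new SqlCommand(userQuery, conn))
            {
                cmd.CommandTimeout = 10; // seconds; throws SqlException when exceeded

                var result = new DataTable();
                using (var reader = cmd.ExecuteReader())
                    result.Load(reader);
                return result;
            }
        }
    }
}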
I assume there is a programmatic way in C#/SQL Server to get a query's execution plan before it runs. If so, which numbers matter? Estimated I/O cost? Estimated CPU cost? And what would be reasonable thresholds beyond which we can tell the user their query isn't acceptable?
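For what it's worth, here is the kind of check I am imagining, assuming SET SHOWPLAN_XML is an acceptable way to get the estimated plan without executing anything (the cost ceiling in the usage note and the userQuery / connectionString names are placeholders of mine):

using System;
using System.Data.SqlClient;
using System.Linq;
using System.Xml.Linq;

static class PlanCostChecker
{
    // Asks SQL Server for the estimated plan of userQuery without running it,
    // then returns the optimizer's total estimated subtree cost.
    public static double EstimateCost(string connectionString, string userQuery)
    {
        XNamespace ns = "http://schemas.microsoft.com/sqlserver/2004/07/showplan";

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Must be in its own batch; afterwards, statements on this connection
            // return their estimated plan XML instead of executing.
            using (var showplanOn = new SqlCommand("SET SHOWPLAN_XML ON;", conn))
                showplanOn.ExecuteNonQuery();

            string planXml;
            using (var cmd = new SqlCommand(userQuery, conn))
                planXml = (string)cmd.ExecuteScalar();

            using (var showplanOff = new SqlCommand("SET SHOWPLAN_XML OFF;", conn))
                showplanOff.ExecuteNonQuery();

            return XDocument.Parse(planXml)
                            .Descendants(ns + "StmtSimple")
                            .Sum(s => (double?)s.Attribute("StatementSubTreeCost") ?? 0.0);
        }
    }
}

// Usage: refuse to run anything whose estimated cost exceeds some arbitrary ceiling.
// double cost = PlanCostChecker.EstimateCost(connectionString, userQuery);
// if (cost > 50) { /* ask the user to narrow the query instead of running it */ }

But I have no idea what a sensible ceiling would be, which is really what I'm asking.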
EDIT: We are a market research company. We have thousands of surveys, each with its own data. We have dozens of researchers who want to slice that data in arbitrary ways. We have tools that let them build “valid” filters through a graphical interface, but some “advanced users” want to supply their own queries. I understand this is neither standard nor best practice, but how else can I let dozens of users query the tables for the rows they want, using arbitrarily complex and constantly changing conditions?