Writing a domain-specific language for selecting rows from a table

I am writing a server that I expect to be run by many different people, not all of whom I will have direct contact with. The servers will communicate with each other in a cluster. Part of the server's functionality involves selecting a small subset of rows from a potentially very large table. The exact choice of which rows are selected will need some tuning, and it is important that the person running the cluster (for example, me) can update the selection criteria without having to get each server administrator to deploy a new version of the server.

Simply writing the function in Python is not really an option, since no one wants to install a server that loads and executes arbitrary Python code at runtime.

I need suggestions on the simplest way to implement a domain-specific language to achieve this. The language should be capable of simple expression evaluation, as well as querying table indexes and iterating over the returned rows. Ease of writing and reading the language is secondary to ease of implementation. I would also prefer not to write a complete query optimizer, so something that explicitly specifies which indexes to query would be ideal.

The interface that this will have to compile against will be similar in capabilities to what the App Engine datastore exports: you can query for sequential ranges on any index on the table (for example, less-than, greater-than, range and equality queries), and then filter the returned rows with any boolean expression. You can also concatenate multiple independent result sets.
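To make the shape of that interface concrete, here is a minimal in-memory sketch. The class name IndexedTable, the scan method, and the sample rows are all made up for illustration and are not the App Engine API; bounds are treated as inclusive to keep it short.

```python
import bisect
from itertools import chain


class IndexedTable:
    """Hypothetical in-memory stand-in for the storage interface."""

    def __init__(self, rows, indexed_columns):
        # For each indexed column keep the sorted keys next to the rows
        # in the same order, so range lookups can use binary search.
        self._indexes = {}
        for col in indexed_columns:
            ordered = sorted(rows, key=lambda r: r[col])
            keys = [r[col] for r in ordered]
            self._indexes[col] = (keys, ordered)

    def scan(self, column, low=None, high=None):
        """Iterate rows with low <= row[column] <= high, in index order."""
        keys, ordered = self._indexes[column]
        start = 0 if low is None else bisect.bisect_left(keys, low)
        stop = len(keys) if high is None else bisect.bisect_right(keys, high)
        return iter(ordered[start:stop])


rows = [
    {"id": 1, "age": 17, "city": "Oslo"},
    {"id": 2, "age": 25, "city": "Lima"},
    {"id": 3, "age": 31, "city": "Oslo"},
    {"id": 4, "age": 42, "city": "Pune"},
]
table = IndexedTable(rows, ["age", "city"])

# Range query on one index, then an arbitrary boolean filter on each row.
adults_in_oslo = [r for r in table.scan("age", low=18) if r["city"] == "Oslo"]

# Two independent result sets concatenated.
combined = list(chain(table.scan("age", high=17),
                      table.scan("city", low="Pune", high="Pune")))
```

Whatever language is chosen only needs to express these three operations: pick an index and a range, filter, and concatenate.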

I realize this question sounds a lot like I am asking for SQL. However, I do not want to require that the datastore backing this data be a relational database, and I do not want the overhead of trying to reimplement SQL myself. I am also dealing with only a single table with a known schema. Finally, no joins are required. Something much simpler would be far preferable.

: , .

+5
9

DSL Python.

1. . SQL , . Command Strategy . - , - . Apache Ant Task API - .

2. , . , . Command Strategy, Command . Command .

. - , . [ , . , , , " ", .]

, , , . ? DSL , , , Python . .

, . .

Python - " ". , unit test , Python script . Python DSL.

[ " ", : " Python, DSL ". , PYTHONPATH sys.path. site .]

DSL. . Python, , . , Django.

ConfigParser .

JSON YAML . .

XML. , . . , Ant Maven ( ) . , . Python.

.

+4

, . , - .

, , DSL - "SQL". SQL, , .

, , , DSL ( , , ); , , Filter.

, "", SelectionCriterion. , (Range, LessThan, ExactMatch, Like ..). , , , , , , , , - AND OR NOT .

, ; Excel , .

, SQL .

: , , , SQL, WHERE . , FROM, , , , .

+1

" , "

" , Python "

DSL, , Python DSL. . DSL? , Python?

, C, Python? ?

- Python - Python?

+1

, , "", SQL , ?

.

0

Python. Python? - "" DSL, Python.

, , - .

0

, , . , , DSL ( ), , , . - , , DSL , , SQL. , .

, , , -, , API, ( , - ).

, , . , ANTLR Yacc, , ( Lisp/Scheme ). SQL . google 'BNF SQL' , .

.

0

SQL, , , SQLite, ?

0

, , DSL. ANTLR, , . ANTLR Python, SQL, Java, ++, C, # ..

, ANTLR #

0

- , . , , , , - SQLite3.

def select_keys(keys, from_):
    # keys maps output name -> (source column, transform);
    # apply each transform to every row.
    return ({k: fun(v, row) for k, (v, fun) in keys.items()}
            for row in from_)

def select_where(from_, where):
    # Lazily keep only the rows for which the predicate holds.
    return (row for row in from_
            if where(row))

def default_keys_transform(keys, transform=lambda v, row: row[v]):
    # Turn a plain list of column names into the dict form used by select_keys.
    return {k: (k, transform) for k in keys}

def select(keys=None, from_=None, where=None):
    """
    SELECT v1 AS k1, 2*v2 AS k2 FROM table WHERE v1 = a AND v2 >= b OR v3 = c

    translates to

    select(dict(k1=(v1, lambda v1, r: r[v1]), k2=(v2, lambda v2, r: 2*r[v2]))
        , from_=table
        , where=lambda r: r[v1] == a and r[v2] >= b or r[v3] == c)
    """
    assert from_ is not None
    idfunc = lambda k, t: t
    select_k = idfunc if keys is None else select_keys
    if isinstance(keys, list):
        keys = default_keys_transform(keys)
    idfunc = lambda t, w: t
    select_w = idfunc if where is None else select_where
    return select_k(keys, select_w(from_, where))
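A usage sketch for the functions above. Compact equivalents are repeated here only so that the snippet runs on its own; the table contents and the column names v1, v2, v3 are made up.

```python
def select_keys(keys, from_):
    return ({k: fun(v, row) for k, (v, fun) in keys.items()} for row in from_)

def select_where(from_, where):
    return (row for row in from_ if where(row))

def select(keys=None, from_=None, where=None):
    # Same behavior as the longer version: filter first, then project.
    rows = from_ if where is None else select_where(from_, where)
    if isinstance(keys, list):
        keys = {k: (k, lambda v, row: row[v]) for k in keys}
    return rows if keys is None else select_keys(keys, rows)

table = [
    {"v1": 1, "v2": 10, "v3": "x"},
    {"v1": 2, "v2": 20, "v3": "y"},
    {"v1": 3, "v2": 30, "v3": "c"},
]

# SELECT v1 AS k1, 2*v2 AS k2 FROM table WHERE v1 = 1 AND v2 >= 10 OR v3 = 'c'
result = list(select(
    keys={"k1": ("v1", lambda v, r: r[v]),
          "k2": ("v2", lambda v, r: 2 * r[v])},
    from_=table,
    where=lambda r: r["v1"] == 1 and r["v2"] >= 10 or r["v3"] == "c",
))

# A plain list of column names selects those columns unchanged.
names_only = list(select(keys=["v1"], from_=table))
```

Everything is a generator until list() is called, so rows are only pulled from the table as they are consumed.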

, . . , , .

import operator

ALLOWED_FUNCS = [operator.mul, operator.add, ...]  # List of allowed funcs

def select_secure(keys=None, from_=None, where=None):
    if keys is not None and isinstance(keys, dict):
        # Every column transform must be one of the allowed functions.
        for v, fun in keys.values():
            assert fun in ALLOWED_FUNCS
    if where is not None:
        assert_composition_of_allowed_funcs(where, ALLOWED_FUNCS)
    return select(keys=keys, from_=from_, where=where)

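The hard part is implementing assert_composition_of_allowed_funcs. That is very difficult to do in Python, but easy in Lisp. Assume instead that where is a nested tuple of functions to be evaluated in a Lisp-like format, for example where=(operator.add, (operator.getitem, row, v1), 2) or where=(operator.mul, (operator.add, (operator.getitem, row, v2), 2), 3).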

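That makes it possible to write an apply_lisp function which ensures that where is built only from ALLOWED_FUNCS and from constants such as float, int, str.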

def apply_lisp(where, rowsym, rowval, ALLOWED_FUNCS):
    # Only call functions that are explicitly allowed.
    assert where[0] in ALLOWED_FUNCS
    # apply() was removed in Python 3; call the function with unpacked arguments.
    return where[0](
          *[ (apply_lisp(w, rowsym, rowval, ALLOWED_FUNCS)
            if isinstance(w, tuple)
            else rowval if w is rowsym
            else w if isinstance(w, (float, int, str))
            else None ) for w in where[1:] ])
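Here is a sketch of how such a nested-tuple expression could be used as a row filter. It is a variant rather than the function above: the names evaluate and ROW are made up, disallowed functions or arguments raise ValueError instead of relying on assert, and the allow-list is checked by identity.

```python
import operator

ALLOWED_FUNCS = (operator.add, operator.mul, operator.getitem, operator.ge)
ROW = object()  # hypothetical placeholder standing for "the current row"

def evaluate(expr, row):
    """Evaluate a (func, arg, ...) tuple against one row."""
    func, *args = expr
    if not any(func is allowed for allowed in ALLOWED_FUNCS):
        raise ValueError("function not allowed: %r" % (func,))
    values = []
    for arg in args:
        if isinstance(arg, tuple):
            values.append(evaluate(arg, row))   # nested sub-expression
        elif arg is ROW:
            values.append(row)                  # substitute the current row
        elif type(arg) in (float, int, str):
            values.append(arg)                  # plain constant
        else:
            raise ValueError("unsupported argument: %r" % (arg,))
    return func(*values)

# (row["v2"] + 2) * 3 >= 30
where = (operator.ge,
         (operator.mul, (operator.add, (operator.getitem, ROW, "v2"), 2), 3),
         30)

table = [{"v2": 1}, {"v2": 8}, {"v2": 20}]
matching = [row for row in table if evaluate(where, row)]
```

Raising an exception rather than using assert keeps the check in place even when Python runs with -O, which strips assert statements.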

In addition, you will need to check for exact types, because you do not want subclasses with overridden behavior to slip through. Therefore do not use isinstance; use type(w) in (float, int, str). Oh boy, we have run into:
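A small illustration of the difference between the two checks. The class SneakyStr and the helper is_plain_constant are made up for this example.

```python
class SneakyStr(str):
    """Hypothetical subclass whose equality always succeeds."""
    def __eq__(self, other):
        return True

def is_plain_constant(value):
    # Exact type match: subclasses are rejected.
    return type(value) in (float, int, str)

value = SneakyStr("guest")

passes_isinstance = isinstance(value, str)    # True: subclasses pass isinstance
passes_exact = is_plain_constant(value)       # False: exact check rejects it
claims_equal = (value == "admin")             # True: overridden comparison

# bool is a subclass of int, so it is rejected by the exact check as well.
bool_is_plain = is_plain_constant(True)       # False
```

If booleans should be allowed as constants, bool has to be added to the tuple explicitly.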

Greenspun's Tenth Rule of Programming: any sufficiently complicated C or Fortran program contains an ad hoc, informally specified, bug-ridden, slow implementation of half of Common Lisp.

0