Performance of mass expression evaluation in IronPython

In a C# 4.0 application, I have a dictionary of strongly typed ILists of equal length: a dynamically, strongly typed column-based table. I want the user to provide one or more (Python) expressions over the available columns that will be aggregated across all rows. In a static context, it would be:

    IDictionary<string, IList> table;
    // ...
    IList<int> a = table["a"] as IList<int>;
    IList<int> b = table["b"] as IList<int>;
    double sum = 0;
    for (int i = 0; i < n; i++)
        sum += (double)a[i] / b[i]; // Expression to sum up

With n = 10^7, this runs in 0.270 seconds on my laptop (Win7 x64). Replacing the expression with a delegate taking two int arguments takes 0.580 seconds; with an untyped delegate, 1.19 seconds. Creating a delegate from IronPython with

    IDictionary<string, IList> table;
    // ...
    var options = new Dictionary<string, object>();
    options["DivisionOptions"] = PythonDivisionOptions.New;
    var engine = Python.CreateEngine(options);
    string expr = "a / b";
    Func<int, int, double> f = engine.Execute("lambda a, b : " + expr);
    IList<int> a = table["a"] as IList<int>;
    IList<int> b = table["b"] as IList<int>;
    double sum = 0;
    for (int i = 0; i < n; i++)
        sum += f(a[i], b[i]);

takes 3.2 s (and 5.1 s with Func<object, object, object>), a factor of 4 to 5.5 compared with the C# delegates. Is this the expected overhead for what I'm doing? What could be improved?

If I have many columns, the approach chosen above will no longer be sufficient. One solution might be to determine the columns each expression needs and pass only those as arguments. The other solution, which I tried without success, was to use a ScriptScope and resolve the columns dynamically. For this, I defined a RowIterator that has a RowIndex for the active row and a property for each column.

    class RowIterator
    {
        IList<int> la;
        IList<int> lb;

        public RowIterator(IList<int> a, IList<int> b)
        {
            this.la = a;
            this.lb = b;
        }

        public int RowIndex { get; set; }
        public int a { get { return la[RowIndex]; } }
        public int b { get { return lb[RowIndex]; } }
    }

A ScriptScope can be created from an IDynamicMetaObjectProvider, which I expected C# dynamic to implement, but at runtime the call binds to engine.CreateScope(IDictionary), which fails.

    dynamic iterator = new RowIterator(a, b) as dynamic;
    var scope = engine.CreateScope(iterator);
    var expr = engine.CreateScriptSourceFromString("a / b").Compile();
    double sum = 0;
    for (int i = 0; i < n; i++)
    {
        iterator.RowIndex = i;
        sum += expr.Execute<double>(scope);
    }

Next, I derived the RowIterator from DynamicObject and got a running example, but with terrible performance: 158 seconds.

    class DynamicRowIterator : DynamicObject
    {
        Dictionary<string, object> members = new Dictionary<string, object>();
        IList<int> la;
        IList<int> lb;

        public DynamicRowIterator(IList<int> a, IList<int> b)
        {
            this.la = a;
            this.lb = b;
        }

        public int RowIndex { get; set; }
        public int a { get { return la[RowIndex]; } }
        public int b { get { return lb[RowIndex]; } }

        public override bool TryGetMember(GetMemberBinder binder, out object result)
        {
            if (binder.Name == "a") // Why does this happen?
            {
                result = this.a;
                return true;
            }
            if (binder.Name == "b")
            {
                result = this.b;
                return true;
            }
            if (base.TryGetMember(binder, out result))
                return true;
            if (members.TryGetValue(binder.Name, out result))
                return true;
            return false;
        }

        public override bool TrySetMember(SetMemberBinder binder, object value)
        {
            if (base.TrySetMember(binder, value))
                return true;
            members[binder.Name] = value;
            return true;
        }
    }

I was surprised that TryGetMember is called with the names of the declared properties. From the documentation, I would expect TryGetMember to be called only for undefined properties.

Perhaps, for reasonable performance, I will need to implement IDynamicMetaObjectProvider for my RowIterator so that dynamic CallSites can be used, but I could not find a suitable example. In my experiments, I did not know how to handle __builtins__ in BindGetMember:

    class Iterator : IDynamicMetaObjectProvider
    {
        IList<int> la;
        IList<int> lb;

        public Iterator(IList<int> a, IList<int> b)
        {
            this.la = a;
            this.lb = b;
        }

        public int RowIndex { get; set; }
        public int a { get { return la[RowIndex]; } }
        public int b { get { return lb[RowIndex]; } }

        public DynamicMetaObject GetMetaObject(Expression parameter)
        {
            return new MetaObject(parameter, this);
        }

        private class MetaObject : DynamicMetaObject
        {
            internal MetaObject(Expression parameter, Iterator self)
                : base(parameter, BindingRestrictions.Empty, self) { }

            public override DynamicMetaObject BindGetMember(GetMemberBinder binder)
            {
                switch (binder.Name)
                {
                    case "a":
                    case "b":
                        // a and b are properties, so bind to the property
                        // and box the int result to object for the binder.
                        return new DynamicMetaObject(
                            Expression.Convert(
                                Expression.Property(
                                    Expression.Convert(Expression, LimitType),
                                    binder.Name),
                                typeof(object)),
                            BindingRestrictions.GetTypeRestriction(Expression, LimitType));
                    default:
                        return base.BindGetMember(binder);
                }
            }
        }
    }

I am sure my code above is sub-optimal, at least it does not yet handle IDictionary columns. I would appreciate any advice on how to improve design and / or performance.
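The idea mentioned earlier, passing each expression only the columns it actually uses, could be sketched on the Python side roughly as follows. This is a minimal illustration written against the Python 3 standard library and not tested under IronPython; `required_columns` and `make_row_function` are hypothetical helper names, not part of any IronPython API.

```python
import ast


def required_columns(expr, available):
    """Return the available column names that the expression reads, sorted."""
    tree = ast.parse(expr, mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    # Names such as built-in functions are dropped by the intersection.
    return sorted(names & set(available))


def make_row_function(expr, columns):
    """Build a function whose parameters are exactly the given columns."""
    return eval("lambda " + ", ".join(columns) + ": " + expr)


columns = required_columns("a / b", ["a", "b", "c", "d"])
f = make_row_function("a / b", columns)
print(columns, f(1, 2))
```

The host would then look up only the listed columns in the table and pass their values in that order, so the delegate signature grows with the expression rather than with the table.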

2 answers

I also compared IronPython performance with a C# implementation. The expression is simple: it just adds the values of two arrays at a given index. Accessing the arrays directly provides the baseline and the theoretical optimum. Accessing the values through a symbol dictionary still has acceptable performance.

The third test creates a delegate from a naive (and intentionally bad) expression tree without any fancy things like call-site caching, but it is still faster than IronPython.

Scripting the expression through IronPython takes the most time. My profiler shows that most of the time is spent in PythonOps.GetVariable, PythonDictionary.TryGetValue and PythonOps.TryGetBoundAttr. I think there is room for improvement.

Timings:

  • Direct: 00:00:00.0052680
  • via Dictionary: 00:00:00.5577922
  • Compiled Delegate: 00:00:03.2733377
  • Scripted: 00:00:09.0485515

Here is the code:

    public static void PythonBenchmark()
    {
        var engine = Python.CreateEngine();
        int iterations = 1000;
        int count = 10000;
        int[] a = Enumerable.Range(0, count).ToArray();
        int[] b = Enumerable.Range(0, count).ToArray();
        Dictionary<string, object> symbols = new Dictionary<string, object> { { "a", a }, { "b", b } };
        Func<int, object> calculate = engine.Execute("lambda i: a[i] + b[i]", engine.CreateScope(symbols));

        var sw = Stopwatch.StartNew();
        int sum = 0;
        for (int iteration = 0; iteration < iterations; iteration++)
        {
            for (int i = 0; i < count; i++)
            {
                sum += a[i] + b[i];
            }
        }
        Console.WriteLine("Direct: " + sw.Elapsed);

        sw.Restart();
        for (int iteration = 0; iteration < iterations; iteration++)
        {
            for (int i = 0; i < count; i++)
            {
                sum += ((int[])symbols["a"])[i] + ((int[])symbols["b"])[i];
            }
        }
        Console.WriteLine("via Dictionary: " + sw.Elapsed);

        var indexExpression = Expression.Parameter(typeof(int), "index");
        var indexerMethod = typeof(IList<int>).GetMethod("get_Item");
        var lookupMethod = typeof(IDictionary<string, object>).GetMethod("get_Item");
        Func<string, Expression> getSymbolExpression =
            symbol => Expression.Call(Expression.Constant(symbols), lookupMethod, Expression.Constant(symbol));
        var addExpression = Expression.Add(
            Expression.Call(Expression.Convert(getSymbolExpression("a"), typeof(IList<int>)), indexerMethod, indexExpression),
            Expression.Call(Expression.Convert(getSymbolExpression("b"), typeof(IList<int>)), indexerMethod, indexExpression));
        var compiledFunc = Expression.Lambda<Func<int, object>>(
            Expression.Convert(addExpression, typeof(object)), indexExpression).Compile();

        sw.Restart();
        for (int iteration = 0; iteration < iterations; iteration++)
        {
            for (int i = 0; i < count; i++)
            {
                sum += (int)compiledFunc(i);
            }
        }
        Console.WriteLine("Compiled Delegate: " + sw.Elapsed);

        sw.Restart();
        for (int iteration = 0; iteration < iterations; iteration++)
        {
            for (int i = 0; i < count; i++)
            {
                sum += (int)calculate(i);
            }
        }
        Console.WriteLine("Scripted: " + sw.Elapsed);

        Console.WriteLine(sum); // make sure cannot be optimized away
    }
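Since the profiler points at global variable lookups, one direction that might be worth trying is to make the columns function parameters (local variables) and to run the whole loop inside the script, so the host calls into Python once per aggregate instead of once per row. The following is only a sketch under those assumptions: `make_aggregate` is a hypothetical helper, the columns are assumed to be iterable from Python, and I have not measured whether it actually helps under IronPython.

```python
def make_aggregate(expr, columns):
    """Compile expr once into a function that sums it over all rows.

    columns must contain at least one column name used as a variable in expr.
    """
    params = ", ".join("col_" + name for name in columns)
    # The trailing comma keeps the tuple unpacking valid for a single column.
    targets = ", ".join(columns) + ","
    source = (
        "def aggregate(" + params + "):\n"
        "    total = 0.0\n"
        "    for " + targets + " in zip(" + params + "):\n"
        "        total += " + expr + "\n"
        "    return total\n"
    )
    namespace = {}
    exec(source, namespace)
    return namespace["aggregate"]


aggregate = make_aggregate("a / b", ["a", "b"])
print(aggregate([1, 2, 3], [2, 4, 6]))
```

Inside the generated function, `a` and `b` are loop-local names, so evaluating the expression involves no dictionary-based scope lookup per row.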

Although I don't know all the specifics of your case, a slowdown of only 5x for doing anything this low-level in IronPython is actually pretty good. Most entries in the Computer Language Benchmarks Game show a slowdown of 10-30x.

The main reason is that IronPython has to allow for the possibility that you have done something sneaky at runtime, and so it cannot generate code with the same efficiency.
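A small illustration of what "something sneaky" can mean: the meaning of `a / b` is not fixed when the expression is compiled but depends on the runtime types of the operands, which may even be user-defined. The `Ratio` class is a made-up example, and the snippet shows general Python semantics rather than anything about IronPython's internals.

```python
class Ratio(object):
    """A user-defined type that gives '/' its own meaning."""

    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        return "custom division"

    __div__ = __truediv__  # Python 2 name for the same hook


f = lambda a, b: a / b

# The same compiled expression has to dispatch differently on every call.
results = [f(6, 3), f(6.0, 4), f(Ratio(1), 2)]
print(results)
```

A statically typed C# delegate such as Func<int, int, double> never has to perform this per-call dispatch, which accounts for part of the gap.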


Source: https://habr.com/ru/post/1313292/