Trimming a 3-dimensional array request based on the data found

I have a service that retrieves data for the following

Func(List<symbols>,List<fields>,StartDate,EndDate) 

It will return a 3-dimensional array of values.

 Sym1 field1 field2 field3 Sym2 field1 field2 field3 Date1 Date2 Date3 

That is, the x axis is fields, the y axis is symbols, and the z axis is dates.

I also have a cache of some of the above values ​​(which I received earlier) in dictionary format

 <Date<Symbol<field,value>>> 

The service charges money based on each of the data points that it returns. So, if we have 3 characters, 4 fields and 2 data dates, then we will get a fee for 24 points.

I need to break the original large requests into several smaller requests only for data not found in the cache.

Eg. If I have the original query for 5 characters A, B, C, D, E and 4 fields F1, F2, F3, F4 for 3 dates D1, D2, D3.

 A,B,C,D,E F1,F2,F3,F4 D1,D2 

Assuming the cache already has data for the following fields

 B,F2,D2 C,F4,D1 

Then the subsequent requests that I will make for maintenance, if they are optimized and broken, will be

 Request1 A,B,C,D F1,F3 D1,D2 Request2 A,D F2,F4 D1,D2 Request3 B F2,F4 D1 Request4 B F4 D2 Request5 C F2,F4 D2 Request6 C F2 D1 

Is there some standard way to split this into smaller queries / a 3D array. What is the best way to achieve this? What data structure will fit my needs?

+6
source share
3 answers

Your problem can be visualized as follows:

Problemm visualization

You have a dense three-dimensional grid, where some cells (gray) are already occupied by the cache. The challenge is to find the minimum number of cuboids to fill the free space. However, your case is slightly different in that the axes for characters and fields are not ordered.

I have the feeling that there is no polynomial solution for this task. Therefore, if you really need to find the optimal solution, there is a chance that you need to look for all the space for the solution (for example, track feedback). Here is the idea of ​​an approximative greedy algorithm:

 for each iSymbol for each iField for each iDate { if(values[iSymbol, iField, iDate] != null) //already filled continue; set<int> symbols = {iSymbol}; //the symbols in the current cuboid set<int> fields = {iField}; //the fields in the current cuboid int maxDate = iDate; //the maximum date index bool dateAxisFinished = false; bool symbolAxisFinished = false; bool fieldAxisFinished = false; for(int i = 0; i < 3; ++i) //extend along all three axis { //check which axis allows the greatest extension int extDate; if(!dateAxisFinished) extDate = checkExtensionDate(iDate, symbols, fields); set<int> extSymbols; if(!symbolAxisFinished) extSymbols = checkExtensionSymbol(iDate, maxDate, iSymbol, fields); set<int> extFields; if(!fieldAxisFinished) extFields = checkExtensionField(iDate, maxDate, symbols, iField); } if(!dateAxisFinished && extDate-iDate+1 >= extSymbols.size && extDate-iDate+1 >= extFields.size) { //fix this extension maxDate = extDate; dateAxisFinished = true; } else if(!symbolAxisFinished && extSymbols.size >= extFields.size) { symbols = extSymbols; symbolAxisFinished = true: } else { fields = extFields; fieldAxisFinished = true; } } perform a query for symbols, fields from iDate to maxDate and put result into values } // ----------------------- //returns the maximum date index that can be included in the current cuboid int checkExtensionDate(int dateFrom, set<int> symbols, set<int> fields) { for iDate from dateFrom + 1 to maxDate for each iSymbol in symbols for each iField in fields if(values[iSymbol, iField, iDate] != null return iDate - 1; } //returns the maximum set of symbols that can be included in the current cuboid set<int> checkExtensionSymbol(int dateFrom, int dateTo, int startSymbol, set<int> fields) { set<int> result = { startSymbol }; for each iSymbol in allSymbols \ { iSymbol } { bool symbolOk = true; for each iDate from dateFrom to dateTo { if(!symbolOk) break; for each iField in fields { if(!symbolOk) break; if(values[iSymbol, iField, iDate] != null symbolOk = false; } } if(symbolOk) result.add(iSymbol); } return result; } //similar method for fields 

This is just a basic idea and may require some improvements.

+3
source

Proposed Approach:

  • Create a result data structure before calling the fetch API service

  • Fill out the results structure using data from the cache for what is available.

  • Call the external service / API for values ​​that are not populated (using the result structure).

Fought, everything is ready. For step # 3, you can use Linq to identify the empty slots to be filled.

+2
source

Preamble

It is hard to understand exactly what you want. So sorry if I understood something was wrong.

Answer

For the answer, I assume that (this is important) :

  • Dates are reserved. So, if you request a service for 1 character, 1 field within 3 days (i.e. A, F1, 2014 May 01 - May 03), then you will receive payment for 3 points.
  • You do not pay for inquiries. That is, you will be charged for:

     A, F1, 2013 May 01 β€” May 03 (3 points) 

    and

     A, F1, 2013 May 01 (1 point) A, F1, 2013 May 02 (1 point) A, F1, 2013 May 03 (1 point) (same 3 points charged totally) 

Assuming this, the code will be simple and minimize your bills: D

 // Replace Field, Symbol and SomeType with actual types you use for fields, symbols and values. SomeType? GetCachedData(Tuple<Field, Symbol, DateTime> point) { //your caching code here } void CacheData(Tuple<Field, Symbol, DateTime> point, SomeType value) { //your caching code here } SomeType GetDataFromService(Tuple<Field, Symbol, DateTime> point) { //your service requesting code here } Tuple<Field, Symbol, DateTime, SomeType>[] GetData(IEnumerable<Field> fields, IEnumerable<Symbol> symbols, IEnumerable<DateTime> dates) { var result = new List<Tuple<Field, Symbol, DateTime, SomeType>>(); foreach (var field in fields) foreach (var symbol in symbols) foreach (var date in dates) { var point = new Tuple<Field, Symbol, DateTime>(field, symbol, date); var cachedValue = GetCachedData(point); if (cachedValue.HasValue) { result.Add(new Tuple<Field, Symbol, DateTime, SomeType>(field, symbol, date, cachedValue.Value); continue; } var serviceValue = GetDataFromService(point); CacheData(point, serviceValue); result.Add(new Tuple<Field, Symbol, DateTime, SomeType>(field, symbol, date, serviceValue); } return result.ToArray(); } 
0
source

All Articles