Background
I know that in Oracle you can create custom aggregate functions that process a collection of values and return a single result. Edit: I have even read the friendly manual at docs.oracle.com/cd/B28359_01/appdev.111/b28425/aggr_functions.htm!
I also know that Oracle provides built-in analytic functions, such as DENSE_RANK and RATIO_TO_REPORT, which return a value for each input row, relative to the collection/window of values in which that input lies.
Problem
I want to know whether there is a way to create my own analytic function, presumably in much the same way that I can create my own aggregate function, and in particular whether my custom analytic function can take additional arguments.
Subtle terminology caveat
When I refer to an "analytic function", read it as a function that, in addition to accepting windowing parameters via the PARTITION keyword, can also return different values within a given window. (If anyone has a better term for this, please let me know! Pure analytic function? DENSE_RANK-style analytic function? Non-aggregate analytic function?)
The Oracle documentation notes that an aggregate function can be used as an analytic (window) function. Unfortunately, this only means that the PARTITION keyword for specifying the window in analytic functions can also be applied to aggregate functions. It does not promote the aggregate function to my desired status of being able to return different values within a fixed window.
Aggregate used as analytic:
SELECT SUM(income) OVER (PARTITION BY first_initial) AS total FROM data;
will have as many records as data, but will only have as many distinct total values as there are first initials.
Analytic used as analytic:
SELECT RATIO_TO_REPORT(income) OVER (PARTITION BY first_initial) AS ratio FROM data;
will have as many records as data, AND, even within a single first_initial partition, the ratio values may all be different.
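To make the distinction concrete, here is a small Python model of the two queries (an illustration only, not Oracle code). The sample rows and the helper names `partition_by_initial`, `sum_over`, and `ratio_to_report_over` are invented for the example.

```python
from collections import defaultdict

# Invented sample rows: (name, first_initial, income).
data = [
    ("Alice", "A", 100),
    ("Adam", "A", 300),
    ("Bob", "B", 50),
    ("Beth", "B", 150),
]

def partition_by_initial(rows):
    """Group incomes by first_initial, like PARTITION BY first_initial."""
    parts = defaultdict(list)
    for _name, initial, income in rows:
        parts[initial].append(income)
    return parts

def sum_over(rows):
    """Model of SUM(income) OVER (PARTITION BY first_initial)."""
    parts = partition_by_initial(rows)
    return [sum(parts[initial]) for _name, initial, _income in rows]

def ratio_to_report_over(rows):
    """Model of RATIO_TO_REPORT(income) OVER (PARTITION BY first_initial)."""
    parts = partition_by_initial(rows)
    return [income / sum(parts[initial]) for _name, initial, income in rows]

totals = sum_over(data)
ratios = ratio_to_report_over(data)

print(totals)  # [400, 400, 200, 200]   one distinct value per partition
print(ratios)  # [0.25, 0.75, 0.25, 0.75]   values differ within a partition
```

Both results have one entry per input row, but only the second varies inside a partition, which is the property I am after.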
Context
I have been given call-only access to a PL/SQL procedure that takes a numeric collection as an IN OUT parameter and has several other IN configuration parameters. The procedure modifies the values in the collection (think of it as "the University's Authorized and Required Grade-Adjustment Procedure") in a way that depends on the configuration parameters.
Currently, the procedure is used by hard-coding a cursor loop that detects the change from one partition of the data to the next and, within each partition, fetches the data into a collection, which is then passed to the procedure, modified, and ultimately dumped back into a separate table. I had planned to improve this by creating a PIPELINED PARALLEL_ENABLE table function that encapsulates some of the logic, but I would much prefer to enable queries such as:
SELECT G.Course_ID
     , G.Student_ID
     , G.Raw_Grade
     , analytic_wrapper(G.raw_grade, P.course_config_data)
         OVER (PARTITION BY G.Course_ID) AS Adjusted_Grade
     , P.course_config_data
FROM grades G
LEFT JOIN policies P
  ON G.Course_ID = P.Course_ID;
This requires being able to create a custom analytic function and, because the procedure needs different inputs for different partitions (for example, the Course_ID-specific P.course_config_data above), that function must accept not only the argument carrying the data but also additional inputs.
Is this possible, and if so, where can I find the documentation? My Google-fu has failed me.
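To pin down the semantics I would want from analytic_wrapper, here is a rough Python model (not Oracle code, and not a claim about how Oracle would implement it). The procedure `adjust_grades`, its `bonus` setting, the course IDs, and the sample rows are all hypothetical stand-ins; the real procedure is a black box to me.

```python
from collections import defaultdict

calls = []  # records which partitions the procedure was called for

def adjust_grades(grades, config):
    """Hypothetical stand-in for the supplied procedure: mutates grades in place."""
    for i, grade in enumerate(grades):
        grades[i] = min(100, grade + config["bonus"])

def analytic_wrapper(rows, procedure):
    """Return one adjusted grade per row, in the original row order.

    rows is a list of (course_id, raw_grade, config); config is assumed
    to be identical for every row of the same course.
    """
    # Remember where each partition's rows sit in the input.
    positions = defaultdict(list)
    for pos, (course_id, _raw, _config) in enumerate(rows):
        positions[course_id].append(pos)

    adjusted = [None] * len(rows)
    for course_id, idxs in positions.items():
        grades = [rows[p][1] for p in idxs]   # the IN OUT collection
        config = rows[idxs[0]][2]             # the extra per-partition input
        procedure(grades, config)             # exactly one call per partition
        calls.append(course_id)
        for p, grade in zip(idxs, grades):    # scatter results back to rows
            adjusted[p] = grade
    return adjusted

rows = [
    ("MATH101", 70, {"bonus": 5}),
    ("CS200", 95, {"bonus": 10}),
    ("MATH101", 88, {"bonus": 5}),
    ("CS200", 60, {"bonus": 10}),
]
result = analytic_wrapper(rows, adjust_grades)
print(result)  # [75, 100, 93, 70]
print(calls)   # ['MATH101', 'CS200']
```

The two points that matter are that every row gets its own value back and that the configuration travels with the partition rather than with the individual row.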
Extra wrinkle
The PL/SQL procedure I have been given is (effectively) non-deterministic, and its output has statistical properties that must be preserved. For example, if A={A[0], A[1], A[2]} are the raw grades for one particular class, B=f(A) is the result of calling the procedure on A at 1:00, and C=f(A) is the result of calling the procedure on A at 1:15, then B={B[0],B[1],B[2]} and C={C[0],C[1],C[2]} are both acceptable outputs to use, but a mixture of elements such as {C[0],B[1],C[2]} is not acceptable.
As a consequence, the procedure must be called exactly once for each partition. (Well, technically, it can be wastefully called as many times as you like, but all of the results for a partition must come from the same call.)
Suppose, for example, that the procedure I have been given works as follows: it takes a collection of grades as an IN OUT parameter and then sets one of those grades, chosen at random, to 100. All other grades are set to zero. Running this at 13:00 might leave Alice with the only passing grade, while running it at 13:01 might leave Bob with the only passing grade. Either way, it must be the case that exactly one student per class passes, no more and no less.
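The toy procedure and the reason results from different calls must not be mixed can be sketched in Python as follows (an illustration only; `toy_procedure`, `run_once`, the seed, and the raw grades are made up for the example).

```python
import random

def toy_procedure(grades, rng):
    """Toy model: set one randomly chosen grade to 100 and all others to 0, in place."""
    winner = rng.randrange(len(grades))
    for i in range(len(grades)):
        grades[i] = 100 if i == winner else 0

def run_once(raw, rng):
    """Copy the raw grades and apply the procedure to the copy exactly once."""
    out = list(raw)
    toy_procedure(out, rng)
    return out

def passing(grades):
    """Number of students with a passing grade (100 in this toy model)."""
    return sum(1 for g in grades if g == 100)

rng = random.Random(0)   # seeded only to make the illustration repeatable
raw = [72, 85, 64]       # invented raw grades for one class

b = run_once(raw, rng)   # the "1:00" call
c = run_once(raw, rng)   # the "1:15" call
while c == b:            # retry until the two calls pick different winners
    c = run_once(raw, rng)

# Element-wise mixtures: every entry is taken from either b or c.
mixed_none = [min(x, y) for x, y in zip(b, c)]  # nobody passes
mixed_two = [max(x, y) for x, y in zip(b, c)]   # two students pass

print(passing(b), passing(c))                   # 1 1
print(passing(mixed_none), passing(mixed_two))  # 0 2
```

Each individual call satisfies the "exactly one student passes" property, but a per-row mixture of two calls can violate it in either direction.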