Storing large SQL datasets with a variable number of columns

In America’s Cup yachting, we generate large data sets: at each time stamp (for example, at 100 Hz) we need to store perhaps 100-1000 channels of sensor data (for example, speed, loads, pressures). We store this in MS SQL Server and need to be able to retrieve subsets of channels for analysis, and to run queries such as the maximum pressure on a particular sensor in a test or over an entire season.

The set of channels to be stored stays the same for several thousand time stamps, but it changes from day to day as sensors are added, renamed, etc., and the number of channels can vary greatly depending on whether we are testing, racing or modeling.

The textbook way to structure the SQL tables would probably be as follows:

OPTION 1

ChannelNames
+-----------+-------------+
| ChannelID | ChannelName |
+-----------+-------------+
| 50        | Pressure    |
| 51        | Speed       |
| ...       | ...         |
+-----------+-------------+

Sessions
+-----------+---------------+-------+----------+
| SessionID |   Location    | Boat  | Helmsman |
+-----------+---------------+-------+----------+
| 789       | San Francisco | BoatA |  SailorA |
| 790       | San Francisco | BoatB |  SailorB |
| ...       | ...           | ...   |          |
+-----------+---------------+-------+----------+

SessionTimestamps
+-------------+-------------+------------------------+
| SessionID   | TimestampID | DateTime               |
+-------------+-------------+------------------------+
| 789         |       12345 | 2013/08/17 10:30:00:00 |
| 789         |       12346 | 2013/08/17 10:30:00:01 |
| ...         |       ...   | ...                    |
+-------------+-------------+------------------------+

ChannelData
+-------------+-----------+-----------+
| TimestampID | ChannelID | DataValue |
+-------------+-----------+-----------+
| 12345       | 50        | 1015.23   |
| 12345       | 51        | 12.23     |
| ...         | ...       | ...       |
+-------------+-----------+-----------+

This structure is neat but inefficient. Each DataValue requires three storage fields, and at each time stamp we need to INSERT 100-1000 rows.
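As a point of reference, the "maximum pressure in a session" query against the Option 1 tables can be sketched as below. The column types and keys are assumptions (the tables above do not specify them), temporary tables are used only so the sketch runs on its own, and the second timestamp's values are borrowed from the Option 2 sample rows.

```sql
-- Assumed column types and keys; the original tables do not state them.
CREATE TABLE #ChannelNames (
    ChannelID   int           NOT NULL PRIMARY KEY,
    ChannelName nvarchar(100) NOT NULL
);
CREATE TABLE #SessionTimestamps (
    SessionID   int          NOT NULL,
    TimestampID int          NOT NULL PRIMARY KEY,
    [DateTime]  datetime2(2) NOT NULL
);
CREATE TABLE #ChannelData (
    TimestampID int   NOT NULL,
    ChannelID   int   NOT NULL,
    DataValue   float NOT NULL,
    -- Leading with ChannelID keeps all values of one channel together.
    PRIMARY KEY (ChannelID, TimestampID)
);

INSERT INTO #ChannelNames VALUES (50, N'Pressure'), (51, N'Speed');
INSERT INTO #SessionTimestamps VALUES
    (789, 12345, '2013-08-17T10:30:00.00'),
    (789, 12346, '2013-08-17T10:30:00.01');
INSERT INTO #ChannelData VALUES
    (12345, 50, 1015.23), (12345, 51, 12.23),
    (12346, 50, 1012.51), (12346, 51, 12.44);

-- Maximum pressure recorded during session 789.
SELECT MAX(cd.DataValue) AS MaxPressure
FROM #ChannelData AS cd
JOIN #ChannelNames AS cn ON cn.ChannelID = cd.ChannelID
JOIN #SessionTimestamps AS st ON st.TimestampID = cd.TimestampID
WHERE cn.ChannelName = N'Pressure'
  AND st.SessionID = 789;
```

Dropping the SessionID filter, or replacing it with a date range on SessionTimestamps, would give the season-wide maximum with the same query shape.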

If we always had the same channels, it would be more sensible to use one row per time stamp and structure the table as follows:

OPTION 2

+-----------+------------------------+----------+-------+----------+--------+-----+
| SessionID | DateTime               | Pressure | Speed | LoadPt   | LoadSb | ... |
+-----------+------------------------+----------+-------+----------+--------+-----+
| 789       | 2013/08/17 10:30:00:00 | 1015.23  | 12.23 | 101.12   | 98.23  | ... |
| 789       | 2013/08/17 10:30:00:01 | 1012.51  | 12.44 | 100.33   | 96.82  | ... |
| ...       | ...                    | ...      |       |          |        |     |
+-----------+------------------------+----------+-------+----------+--------+-----+

However, the channels change from day to day, so over the months the number of columns would keep growing, with most cells ending up empty. We could create a new table for every new session, but it does not feel right to use the table name as a variable, and it would ultimately lead to tens of thousands of tables. It would also be very difficult to query across a season when the data is stored in many tables.

Another variant:

OPTION 3

+-----------+------------------------+----------+----------+----------+----------+-----+
| SessionID | DateTime               | Channel1 | Channel2 | Channel3 | Channel4 | ... |
+-----------+------------------------+----------+----------+----------+----------+-----+
| 789       | 2013/08/17 10:30:00:00 | 1015.23  |    12.23 | 101.12   | 98.23    | ... |
| 789       | 2013/08/17 10:30:00:01 | 1012.51  |    12.44 | 100.33   | 96.82    | ... |
| ...       | ...                    | ...      |          |          |          |     |
+-----------+------------------------+----------+----------+----------+----------+-----+

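This would use a lookup table from the Channel column IDs to the channel names, but it requires EXEC or eval to run a pre-built query for the channel we want, because SQL is not designed to treat column names as variables. Columns could be reused when the channels change, but many cells would still be empty. SPARSE columns might help with that, but I am uncomfortable with the EXEC/eval issue.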
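To make the EXEC concern concrete, here is a minimal sketch of what reading one named channel from an Option 3 table involves. The #ChannelColumns lookup table, the #WideData table and their column types are hypothetical names introduced only for illustration.

```sql
-- Hypothetical lookup from channel name to the generic column that holds it.
CREATE TABLE #ChannelColumns (
    ChannelName nvarchar(100) NOT NULL PRIMARY KEY,
    ColumnName  nvarchar(128) NOT NULL
);
INSERT INTO #ChannelColumns VALUES (N'Pressure', N'Channel1'), (N'Speed', N'Channel2');

-- Hypothetical Option 3 table, cut down to two channel columns.
CREATE TABLE #WideData (
    SessionID  int          NOT NULL,
    [DateTime] datetime2(2) NOT NULL,
    Channel1   float        NULL,
    Channel2   float        NULL
);
INSERT INTO #WideData VALUES
    (789, '2013-08-17T10:30:00.00', 1015.23, 12.23),
    (789, '2013-08-17T10:30:00.01', 1012.51, 12.44);

-- The column name is only known at run time, so the query text must be
-- assembled as a string. QUOTENAME guards the identifier; the session ID
-- is still passed as a real parameter.
DECLARE @col nvarchar(128) =
    (SELECT ColumnName FROM #ChannelColumns WHERE ChannelName = N'Pressure');
DECLARE @sql nvarchar(max) =
    N'SELECT MAX(' + QUOTENAME(@col) + N') AS MaxValue
      FROM #WideData
      WHERE SessionID = @SessionID;';

EXEC sys.sp_executesql @sql, N'@SessionID int', @SessionID = 789;
```

Every query that names a channel needs this two-step lookup and string assembly, which is the cost the Option 1 layout avoids.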

So, which structure is the right one for this problem?

+4
2

I would go with Option 1.

, ( ) - .

NULL , . .

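Also, a regular (non-wide) table is limited to 1024 columns, so with 1000 sensors/channels you would be very close to that limit. A wide table allows up to 30,000 columns, but the row size is still limited to 8,060 bytes.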

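I would not use wide tables here, even if a row were certain never to exceed 8060 bytes and the number of channels never to exceed 30,000.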

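Inserting 100 - 1000 rows in Option 1 instead of 1 row is not a problem in itself. To make the INSERT efficient, do not issue 1000 separate INSERT statements; send the rows as a batch. Two ways to do that: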

1) Build one long multi-row INSERT statement

INSERT INTO ChannelData (TimestampID, ChannelID, DataValue) VALUES
(12345, 50, 1015.23),
(12345, 51, 12.23),
...
(), (), (), (), ........... ();

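This still inserts 1000 rows, just as 1000 separate INSERT statements would, but in a single statement (a single VALUES list accepts at most 1000 rows).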

2) Use a stored procedure that accepts a table-valued parameter. Call it, passing the 1000 rows as a table.

CREATE TYPE [dbo].[ChannelDataTableType] AS TABLE(
    [TimestampID] [int] NOT NULL,
    [ChannelID] [int] NOT NULL,
    [DataValue] [float] NOT NULL
)
GO

CREATE PROCEDURE [dbo].[InsertChannelData]
    -- Rows to insert, passed as a table-valued parameter (must be READONLY)
    @ParamRows dbo.ChannelDataTableType READONLY
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    BEGIN TRANSACTION;
    BEGIN TRY

        INSERT INTO [dbo].[ChannelData]
            ([TimestampID],
            [ChannelID],
            [DataValue])
        SELECT
            TT.[TimestampID]
            ,TT.[ChannelID]
            ,TT.[DataValue]
        FROM
            @ParamRows AS TT
        ;

        COMMIT TRANSACTION;
    END TRY
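    BEGIN CATCH
        -- Undo the partial insert, then re-raise so the caller sees the error
        -- (THROW requires SQL Server 2012 or later).
        IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION;
        THROW;
    END CATCH;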

END
GO
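For completeness, calling the procedure from T-SQL might look like the sketch below. It assumes the type and procedure above have been created and that a dbo.ChannelData table with matching columns exists; the sample values are the ones from the ChannelData table in the question.

```sql
-- Fill a variable of the table type, then pass it in one call.
DECLARE @Rows dbo.ChannelDataTableType;

INSERT INTO @Rows (TimestampID, ChannelID, DataValue)
VALUES
    (12345, 50, 1015.23),
    (12345, 51, 12.23);

EXEC dbo.InsertChannelData @ParamRows = @Rows;
```

A client application would typically fill the parameter from its own in-memory rows instead of a T-SQL variable; the procedure call itself stays the same.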

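If possible, accumulate data from several timestamps before inserting, so that the batches are larger. Experiment to find the batch size that works best for your system. My batches are around 10K rows with the stored procedure.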

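If the data arrives 100 times per second, I would first dump the raw incoming data into simple CSV file(s) and have a separate background process load it into the database in chunks.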

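If the table grows so large that queries become slow, consider keeping two tables: ChannelData and InterestingChannelData. ChannelData holds the complete data set. InterestingChannelData holds only the subset that is queried often.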

+2

:

, "" "". , , . (, , )

. . ?

+1
