How to combine data from several databases?

BACKGROUND:

I should preface this by saying that I am not trying to get someone to do their job for me. I feel like I'm at a crossroads where there are several ways to get to my destination, but I'm not sure which ones are “standard” and / or if my relatively limited knowledge is missing.

I have a system that has been developing for half a year, and since January 11, the database schema has been pretty stable. (I was never sure that I was making a serious mistake by creating a database for each month to correspond to monthly accounting cycles, but I simply did not have the know-how to do it differently)

NOW:

My boss asks me to create reports for the year to date, consisting of records from all monthly databases.

WHAT I GET STARTED:

I put together a metadata schema and populated it with enough information to write an application for ETL operations.

Here's what it looks like:

USE [DAMain1] GO CREATE TABLE AccountingPeriod ( Id INT PRIMARY KEY NOT NULL, Name VARCHAR(255) NOT NULL UNIQUE, DateStart DATE NOT NULL, DateStop DATE NOT NULL ) GO INSERT INTO AccountingPeriod VALUES (1, 'Jan11', '1/1/2011', '1/31/2011') ,(2, 'Feb11', '2/1/2011', '2/28/2011') ,(3, 'Mar11', '3/1/2011', '3/31/2011') ,(4, 'Apr11', '4/1/2011', '4/30/2011') ,(5, 'May11', '5/1/2011', '5/31/2011') CREATE TABLE [DBServer] ( Id INT PRIMARY KEY NOT NULL, Name VARCHAR(255) NOT NULL UNIQUE ) GO INSERT INTO DBServer VALUES (1, 'Aaron.directagents.local') GO CREATE TABLE [DBInstance] ( Id INT PRIMARY KEY NOT NULL ,DBServerId int NOT NULL REFERENCES DBServer(Id) ,SchemaName VARCHAR(255) NOT NULL ,CatalogName VARCHAR(255) NOT NULL ,ConnectionString VARCHAR(2000) NOT NULL ) GO INSERT INTO DBInstance VALUES (1, 1, 'dbo', 'DADatabaseR2', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseR2;Integrated Security=True') ,(2, 1, 'dbo', 'DADatabaseR3', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseR3;Integrated Security=True') ,(3, 1, 'dbo', 'DADatabaseMarch11', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseMarch11;Integrated Security=True') ,(4, 1, 'dbo', 'DADatabaseApr11', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseApr11;Integrated Security=True') GO CREATE TABLE DADB ( Id int PRIMARY KEY NOT NULL, Name VARCHAR(255) NOT NULL UNIQUE, AccountingPeriodId int NOT NULL REFERENCES AccountingPeriod(Id), DBInstanceId INT NOT NULL REFERENCES DbInstance(Id) ) GO INSERT INTO DADB VALUES (1, 'Direct Agents Database for January 2011', 1, 1) ,(2, 'Direct Agents Database for February 2011', 2, 2) ,(3, 'Direct Agents Database for March 2011', 3, 3) ,(4, 'Direct Agents Database for April 2011', 4, 4) GO CREATE VIEW DADBs AS SELECT DA.Name [Database] ,AP.Name [Accounting Period] ,AP.DateStart [Start] ,AP.DateStop [Stop] ,DS.Name [Server] ,DI.SchemaName ,DI.CatalogName ,DI.ConnectionString [Connection] FROM DADB DA INNER JOIN AccountingPeriod AP ON DA.AccountingPeriodId=AP.Id INNER JOIN DBInstance DI ON DA.DBInstanceId=DI.Id INNER JOIN DBServer DS ON DI.DBServerId=DS.Id GO SELECT * FROM DADBs GO 

PROBLEM:

I don't know if this is a reasonable / normal way to get around this. I have enough time to grow on one, but I can not independently determine which way to go.

QUESTION: Given that I need to pull position data and aggregate across multiple databases, as I explained, are there alternatives to defining metadata tables that manage custom ETL solutions? (for my purposes, the C # application and the SSIS project are eqiv, but I am interested to know if Analysis Services or Reporting Services can be used here)

+4
source share
1 answer

Bad database designs often go beyond reporting. As you have discovered, having data for each month in separate databases has created a nightmare. Imagine what happens if the accounting cycle dates change? The best solution would be to combine the data into a single database, in which you define the attributes of the records based on the accounting cycle (entered date, date, etc.).

At the same time, given what you have, I would say that the best solution is to create a consolidated database and populate it using SSIS from other databases until you can update the middle tiers or user interfaces to use a consolidated design .

+4
source

All Articles