I have several XML files containing data for a research project, and I need to run some statistics on them. The total amount of data is close to 100 GB.
The structure is not very complex (it could perhaps be mapped to about 10 tables in a relational model). Given the nature of the problem, this data will never be updated again; I only need it somewhere that makes it easy to run queries.
I have read about XML databases and their ability to run XPath-style queries, but I have never used them and I'm not very comfortable with the idea. Having the data in a relational database is my preferred choice.
So I'm looking for a way to convert the data stored in XML into a relational database (think of a large .sql file similar to the one mysqldump generates, although any other format would do). The ultimate goal is to be able to run SQL queries to crunch the data.
After some research, I'm almost convinced I will have to write this myself. But I believe this is a common problem, so there should be a tool that already does it.
So, do you know about any tool that converts XML data into a relational database?
PS1:
My idea would be something like this (it might work differently; this is just to make sure you understand my point):
- Analyze the data structure (based on the XML files themselves or on an XSD)
- Design a relational schema (tables, keys) based on this structure
- Generate SQL statements to create the database
- Generate SQL statements to populate it with data
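To illustrate the steps above, here is a minimal Python sketch using only the standard library. It is not an existing tool, just an outline under several assumptions: the target is SQLite (chosen only because it needs no server), every record is a flat element whose attributes and leaf children become TEXT columns, and the function name `load_xml`, the `record_tags` parameter, and the `_row` key column are all hypothetical. It streams the file with `iterparse`, so the whole document is never parsed into one tree up front. It does not infer foreign keys between nested records.

```python
import io
import sqlite3
import xml.etree.ElementTree as ET


def load_xml(source, conn, record_tags):
    """Stream `source` and insert every element whose tag is in `record_tags`
    as one row of a table named after that tag (hypothetical helper)."""
    known = {}  # table name -> set of column names created so far
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in record_tags:
            continue
        # Attributes and leaf child elements both become columns.
        row = dict(elem.attrib)
        for child in elem:
            if child.tag not in record_tags and len(child) == 0:
                row[child.tag] = (child.text or "").strip()
        table = elem.tag
        if table not in known:
            # `_row` is an assumed surrogate key, not taken from the XML.
            conn.execute(f'CREATE TABLE "{table}" (_row INTEGER PRIMARY KEY)')
            known[table] = set()
        # Grow the schema whenever a record introduces a new field.
        for col in row:
            if col not in known[table]:
                conn.execute(f'ALTER TABLE "{table}" ADD COLUMN "{col}" TEXT')
                known[table].add(col)
        if row:
            names = ", ".join(f'"{c}"' for c in row)
            marks = ", ".join("?" for _ in row)
            conn.execute(
                f'INSERT INTO "{table}" ({names}) VALUES ({marks})',
                list(row.values()),
            )
        elem.clear()  # drop the record's children and text once stored
    conn.commit()


if __name__ == "__main__":
    sample = (
        b"<lib>"
        b"<book id='1'><title>A</title><year>2001</year></book>"
        b"<book id='2'><title>B</title></book>"
        b"</lib>"
    )
    db = sqlite3.connect(":memory:")
    load_xml(io.BytesIO(sample), db, {"book"})
    print(db.execute("SELECT id, title, year FROM book").fetchall())
```

Fields missing from a record simply stay NULL. For real data one would probably derive column types and keys from the XSD rather than storing everything as TEXT.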
PS2:
I saw a few posts here on SO, but I still could not find a solution. Microsoft's "XML Bulk Load" seems to do something in this direction, but I don't have MS SQL Server.