Convert a DataTable that stores IDs from other DataTables so that the IDs are replaced with the corresponding name column values

I have N DataTables, where N-1 of them represent entities and one represents a relationship between those entities.

For example, the Country entity:

    Country DataTable

    ID  | Country Name | Country Code
    ------------------------------------
    ID1 | USA          | USA
    ID2 | INDIA        | IND
    ID3 | CHINA        | CHI

The Continent entity:

    Continent DataTable

    ID   | Continent Name | Continent Code
    ------------------------------------
    IDC1 | NORTH AMERICA  | NA
    IDC2 | SOUTH AMERICA  | SA
    IDC3 | ASIA           | AS

The Company entity:

    Company DataTable

    ID  | Company Name | Company Code
    ------------------------------------
    CM1 | XYZ Company  | XYZ
    CM2 | Fun Company  | Fun
    CM3 | ABC Company  | ABC

The relationship between them:

    Company_Country_Continent_Relationship DataTable

    ID | Company | Country | Continent | Some Value1 | Some Value 2
    -------------------------------------------------------------------------------------
    R1 | CM1     | ID1     | IDC1      | 100         | 150
    R2 | CM2     | ID2     | IDC3      | 200         | 200
    R3 | CM3     | ID1     | IDC1      | 150         | 250
    R4 | CM1     | ID3     | IDC3      | 100         | 150
    R5 | CM2     | ID1     | IDC1      | 200         | 200
    R6 | CM3     | ID2     | IDC3      | 150         | 250
    R7 | CM1     | ID2     | IDC3      | 100         | 150
    R8 | CM2     | ID3     | IDC3      | 200         | 200
    R9 | CM3     | ID3     | IDC3      | 150         | 250

Now I need to create another relationship table that contains names instead of IDs. In this example the relationship table stores the IDs of the company, country and continent; I want to replace each ID with the corresponding name, e.g. CM1 becomes XYZ Company.

I use the TramnsformRelationshipData method for this conversion, and it works correctly.

    public static DataTable TramnsformRelationshipData(DataTable relationshipData, Dictionary<string, DataTable> mapping)
    {
        DataTable transformedDataTable = null;
        if (relationshipData == null || mapping == null)
            return null;

        transformedDataTable = relationshipData.Copy();
        foreach (DataColumn item in relationshipData.Columns)
        {
            if (mapping.ContainsKey(item.ColumnName))
            {
                var instanceData = mapping[item.ColumnName];
                if (instanceData == null)
                    return null;

                foreach (DataRow row in transformedDataTable.Rows)
                {
                    var filteredRows = instanceData.Select("ID = '" + row[item.ColumnName] + "'");
                    if (filteredRows.Any())
                        row[item.ColumnName] = filteredRows[0][1];
                }
            }
        }
        return transformedDataTable;
    }

But this method runs a Select for every row of every mapped column, so it becomes very slow when the relationship table contains many rows or many columns to convert. How can I optimize this code so that it handles many entity tables with many rows?

Edit: In most cases this data is not stored in a database; it lives in memory, and the amount of data held in memory can grow or shrink.

Thanks.

3 answers

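The solution here is to build a hash-based collection (e.g. a Hashtable, Dictionary or Lookup in .NET) keyed on the ID column and use it instead of .Select("ID = x"). A hash lookup takes constant time on average, whereas each Select call has to parse and evaluate a filter expression for every row being converted.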

The code might look something like this (untested):

    // Requires: using System.Collections.Generic; using System.Data; using System.Linq;
    public static DataTable TramnsformRelationshipData(DataTable relationshipData, Dictionary<string, DataTable> mapping)
    {
        if (relationshipData == null || mapping == null)
            return null;

        // Build one ID -> DataRow dictionary per mapped column, once, up front.
        var newMappings = new Dictionary<string, Dictionary<string, DataRow>>();
        foreach (var kvp in mapping)
        {
            if (kvp.Value == null)
                return null;
            newMappings.Add(kvp.Key, kvp.Value.Rows.Cast<DataRow>().ToDictionary(dr => dr["ID"].ToString()));
        }

        DataTable transformedDataTable = relationshipData.Copy();
        foreach (DataColumn item in relationshipData.Columns)
        {
            Dictionary<string, DataRow> instanceData;
            if (!newMappings.TryGetValue(item.ColumnName, out instanceData))
                continue;

            foreach (DataRow row in transformedDataTable.Rows)
            {
                // Hash lookup instead of instanceData.Select("ID = '...'").
                DataRow match;
                if (instanceData.TryGetValue(row[item.ColumnName].ToString(), out match))
                    row[item.ColumnName] = match[1];
            }
        }
        return transformedDataTable;
    }
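
To try the dictionary idea end to end, here is a minimal self-contained sketch that uses a few of the sample rows from the question. The names LookupSketch, BuildTable, ReplaceIds and Demo are hypothetical and not part of the original code. The sketch assumes every column is a string, that each entity table keeps its key in a column called "ID", and that the name is in the second column. Unlike the method above, it maps each ID straight to the name string rather than to the whole DataRow.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

public static class LookupSketch
{
    // Hypothetical helper: builds a table of string columns from literal rows.
    public static DataTable BuildTable(string[] columns, params string[][] rows)
    {
        var table = new DataTable();
        foreach (var name in columns)
            table.Columns.Add(name, typeof(string));
        foreach (var row in rows)
            table.Rows.Add(row.Cast<object>().ToArray());
        return table;
    }

    // Replaces the IDs in every mapped column with the value found in the
    // second column of the matching entity table (assumed to be the name).
    public static DataTable ReplaceIds(DataTable relationship, Dictionary<string, DataTable> mapping)
    {
        // One ID -> name dictionary per mapped column, built once up front.
        var lookups = mapping.ToDictionary(
            kvp => kvp.Key,
            kvp => kvp.Value.Rows.Cast<DataRow>()
                      .ToDictionary(r => r["ID"].ToString(), r => r[1].ToString()));

        DataTable result = relationship.Copy();
        foreach (var entry in lookups)
        {
            if (!result.Columns.Contains(entry.Key))
                continue;

            foreach (DataRow row in result.Rows)
            {
                string name;
                // IDs without a match are left unchanged.
                if (entry.Value.TryGetValue(row[entry.Key].ToString(), out name))
                    row[entry.Key] = name;
            }
        }
        return result;
    }

    // Runs the transformation on a cut-down version of the question's data.
    public static DataTable Demo()
    {
        var countries = BuildTable(new[] { "ID", "Country Name", "Country Code" },
            new[] { "ID1", "USA", "USA" }, new[] { "ID2", "INDIA", "IND" });
        var companies = BuildTable(new[] { "ID", "Company Name", "Company Code" },
            new[] { "CM1", "XYZ Company", "XYZ" }, new[] { "CM2", "Fun Company", "Fun" });
        var relations = BuildTable(new[] { "ID", "Company", "Country" },
            new[] { "R1", "CM1", "ID1" }, new[] { "R2", "CM2", "ID2" });

        var mapping = new Dictionary<string, DataTable>
        {
            { "Company", companies },
            { "Country", countries }
        };
        return ReplaceIds(relations, mapping);
    }
}
```

Calling LookupSketch.Demo() returns a copy of the relationship table in which the Company and Country columns hold names instead of IDs. The dictionaries are built once per entity table, so the per-row work is a single hash lookup per mapped column.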

Have you considered doing this in SQL with SELECT ... INTO? If the data is available in a database, that will usually be much faster than doing it in C# code. I generally prefer SQL when I need to work with a large amount of data.

An example taken from MSDN:

    SELECT c.FirstName, c.LastName, e.JobTitle, a.AddressLine1, a.City,
           sp.Name AS [State/Province], a.PostalCode
    INTO dbo.EmployeeAddresses
    FROM Person.Person AS c
    JOIN HumanResources.Employee AS e
        ON e.BusinessEntityID = c.BusinessEntityID
    JOIN Person.BusinessEntityAddress AS bea
        ON e.BusinessEntityID = bea.BusinessEntityID
    JOIN Person.Address AS a
        ON bea.AddressID = a.AddressID
    JOIN Person.StateProvince AS sp
        ON sp.StateProvinceID = a.StateProvinceID;

First write a SELECT that returns the data you want, and then add the INTO clause.

Alternatively, you can use INSERT ... SELECT, which lets you specify the list of columns to insert into. An example from MSDN:

    INSERT INTO Production.ZeroInventory (DeletedProductID, RemovedOnDate)
    SELECT ProductID, GETDATE()
    FROM ...

It seems to me that the problem itself is very simple: three joins would solve it if you used SQL.
I assume the source is not in SQL (if it is, I would recommend creating a view there for maximum performance).
If you need to use DataSets, you can use LINQ to simulate a join.

Check out the documentation on how to use LINQ with DataSets (LINQ to DataSet) and on how to write a join in LINQ.

The end result will look something like this:

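    // Requires: using System.Data; using System.Linq; and a reference to System.Data.DataSetExtensions.
    // Assumes the ID and name columns are strings; column names follow the question's tables.
    var q = from r in relations.AsEnumerable()
            join c in countries.AsEnumerable()
                on r.Field<string>("Country") equals c.Field<string>("ID")
            join con in continents.AsEnumerable()
                on r.Field<string>("Continent") equals con.Field<string>("ID")
            join cm in companies.AsEnumerable()
                on r.Field<string>("Company") equals cm.Field<string>("ID")
            select new
            {
                someval = r["Some Value1"],
                someval2 = r["Some Value 2"],
                companyname = cm.Field<string>("Company Name"),
                countryname = c.Field<string>("Country Name"),
                continent = con.Field<string>("Continent Name")
            };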
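
The query above yields anonymous objects, while the question asks for a new relationship DataTable. Here is a minimal sketch of copying a join result into one. The names JoinSketch and JoinToNames are hypothetical. The sketch assumes string-typed columns named as in the question's tables and covers only the country and continent joins to stay short.

```csharp
using System;
using System.Data;
using System.Linq;

public static class JoinSketch
{
    // Joins the relationship table to the country and continent tables and
    // copies the result into a new DataTable that holds names instead of IDs.
    // AsEnumerable and Field<T> come from System.Data.DataSetExtensions.
    public static DataTable JoinToNames(DataTable relations, DataTable countries, DataTable continents)
    {
        var result = new DataTable();
        result.Columns.Add("ID", typeof(string));
        result.Columns.Add("Country", typeof(string));
        result.Columns.Add("Continent", typeof(string));

        var query =
            from r in relations.AsEnumerable()
            join c in countries.AsEnumerable()
                on r.Field<string>("Country") equals c.Field<string>("ID")
            join con in continents.AsEnumerable()
                on r.Field<string>("Continent") equals con.Field<string>("ID")
            select new
            {
                Id = r.Field<string>("ID"),
                Country = c.Field<string>("Country Name"),
                Continent = con.Field<string>("Continent Name")
            };

        // Materialize the joined rows into the output table.
        foreach (var item in query)
            result.Rows.Add(item.Id, item.Country, item.Continent);

        return result;
    }
}
```

Note that this is an inner join, so relationship rows whose IDs have no match in an entity table are dropped rather than kept with the raw ID. LINQ's join builds a hash lookup over the inner sequence, so it avoids the per-row Select cost in much the same way as the dictionary approach.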
