IDataReader and "HasColumn", the best approach?

I have seen two common approaches for checking whether a column exists in an IDataReader:

    public bool HasColumn(IDataReader reader, string columnName)
    {
        try
        {
            // GetOrdinal throws if the column does not exist
            reader.GetOrdinal(columnName);
            return true;
        }
        catch
        {
            return false;
        }
    }

Or:

    public bool HasColumn(IDataReader reader, string columnName)
    {
        reader.GetSchemaTable()
              .DefaultView.RowFilter = "ColumnName='" + columnName + "'";
        return (reader.GetSchemaTable().DefaultView.Count > 0);
    }

Personally, I use the second, because I dislike using exceptions for control flow like this.

However, on a large dataset, I believe RowFilter may have to do a table scan for each column, and that could be incredibly slow.

Thoughts?

+6
3 answers

I think I have a reasonable answer to this old question.

I would go with the first approach, because it is much simpler. If you want to avoid the exception, you can cache the field names and call TryGetValue on the cache.

    public Dictionary<string,int> CacheFields(IDataReader reader)
    {
        var cache = new Dictionary<string,int>();
        for (int i = 0; i < reader.FieldCount; i++)
        {
            cache[reader.GetName(i)] = i;
        }
        return cache;
    }

The upside of this approach is that it is simpler and gives you better control. Also, note: you may want to look into case-insensitive or kana-insensitive comparisons, which would make things a little more complicated.
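As a rough sketch of how the cached lookup might look with case-insensitive matching: the class name `ReaderFieldCache` and the method `TryGetOrdinal` are hypothetical names introduced for illustration, and `StringComparer.OrdinalIgnoreCase` is only one possible comparer. It does not cover kana-insensitive matching.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderFieldCache
{
    // Hypothetical helper: maps each column name to its ordinal, ignoring case.
    // Assumption: if two columns differ only by case, the later one wins.
    public static Dictionary<string, int> CacheFields(IDataReader reader)
    {
        var cache = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < reader.FieldCount; i++)
        {
            cache[reader.GetName(i)] = i;
        }
        return cache;
    }

    // Returns false on a missing column instead of throwing.
    public static bool TryGetOrdinal(Dictionary<string, int> cache,
                                     string columnName, out int ordinal)
    {
        return cache.TryGetValue(columnName, out ordinal);
    }
}
```

The cache would be built once per reader, before the read loop, and then consulted for every lookup.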

+5

A lot depends on how you use HasColumn. Do you call it just once or twice, or repeatedly in a loop? Is the column likely to be there, or is that completely unknown in advance?

Setting the row filter probably does a table scan each time. (Also, in theory, GetSchemaTable() could generate an entirely new table on every call, which would be even more expensive. I don't think SqlDataReader does this, but at the IDataReader level, who knows?) But if you only call it once or twice, I can't imagine this being much of an issue (unless you have thousands of columns or something).

(I would, however, at least store the result of GetSchemaTable() in a local variable inside the method to avoid calling it twice in quick succession, if not cache it somewhere, on the off chance that your particular IDataReader does regenerate it.)
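A minimal sketch of the row-filter approach with the schema table held in a local variable, so GetSchemaTable() is called only once. The class name `SchemaLookup` is hypothetical. The null check, the quote escaping, and the use of a separate `DataView` instead of `DefaultView` are additions beyond what the question does.

```csharp
using System;
using System.Data;

public static class SchemaLookup
{
    public static bool HasColumn(IDataReader reader, string columnName)
    {
        // Call GetSchemaTable() once and reuse the result.
        DataTable schema = reader.GetSchemaTable();
        if (schema == null)
        {
            // Assumption: a reader with no schema has no columns to find.
            return false;
        }

        // A separate view leaves the table's DefaultView untouched,
        // in case the provider hands out the same table again later.
        using (var view = new DataView(schema))
        {
            // Single quotes inside a filter literal are escaped by doubling them.
            view.RowFilter = "ColumnName='" + columnName.Replace("'", "''") + "'";
            return view.Count > 0;
        }
    }
}
```

Whether the filter matches case-sensitively depends on the schema table's `CaseSensitive` setting, so this sketch may behave differently across providers.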

If you know in advance that under normal circumstances the column you ask for will be present, the exception method becomes a bit more palatable (because a missing column is then, in fact, an exceptional case). Even if it is not, it might perform slightly better, but then again, unless you are calling it repeatedly, you should ask yourself whether performance is really that much of a concern.

And if you are calling it many times, you should probably consider a different approach altogether, such as: call GetSchemaTable() once up front, loop through the table, and load the field names into a Dictionary or some other structure designed for fast lookups.
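The up-front approach could be sketched roughly as follows. The class name `ColumnNameSet` and the method `Load` are hypothetical, and the `HashSet<string>` with a case-insensitive comparer is just one choice of "structure designed for fast lookups".

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ColumnNameSet
{
    // Reads the schema table once and collects the column names.
    public static HashSet<string> Load(IDataReader reader)
    {
        var names = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        DataTable schema = reader.GetSchemaTable();
        if (schema == null)
        {
            // Assumption: no schema means no columns.
            return names;
        }

        // The schema table has one row per column of the result set.
        foreach (DataRow row in schema.Rows)
        {
            names.Add(Convert.ToString(row["ColumnName"]));
        }
        return names;
    }
}
```

With this in place, the set would be loaded once after opening the reader, and each later check becomes a call to `names.Contains(columnName)` with no further schema access.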

+1

I would not worry about the performance impact. Even if you had a table with 1000 columns (which would be an enormous table), you would still only be doing a "table scan" of 1000 rows. That is likely to be trivial.

Premature optimization will just lead you toward an unnecessarily complex implementation. Implement the version that seems best to you, then measure the performance impact. If it is unacceptable compared to your performance requirements, consider alternatives.

0
