
Commit e3d10c8

Document and support strongly typed columns in CSV reader
Expanded the README with detailed documentation and examples for defining strongly typed columns, using built-in and custom type converters, and combining with schema inference. Updated CsvSchemaInference methods to require List<InferredColumn> for improved type safety and consistency.
1 parent f24de66 commit e3d10c8

2 files changed: 69 additions, 2 deletions

project/Dataplat.Dbatools.Csv/README.md

Lines changed: 67 additions & 0 deletions
@@ -30,6 +30,7 @@ Install-Package Dataplat.Dbatools.Csv
 
 - **Streaming IDataReader** - Works seamlessly with SqlBulkCopy and other ADO.NET consumers
 - **Schema Inference** - Analyze CSV data to determine optimal SQL Server column types
+- **Strongly Typed Columns** - Define column types for automatic conversion with built-in and custom converters
 - **High Performance** - ~1.5x faster than LumenWorks/CsvHelper with ArrayPool-based memory management
 - **Parallel Processing** - Optional multi-threaded parsing for large files (25K+ rows/sec)
 - **String Interning** - Reduce memory for files with repeated values
@@ -331,6 +332,72 @@ using var reader = new CsvDataReader("data.csv", options);
 | `TotalCount` | long | Total rows analyzed |
 | `NonNullCount` | long | Rows with non-null values |
 
+### Strongly Typed Columns
+
+Define column types explicitly for automatic conversion during reading:
+
+```csharp
+var options = new CsvReaderOptions
+{
+    ColumnTypes = new Dictionary<string, Type>
+    {
+        ["Id"] = typeof(int),
+        ["Price"] = typeof(decimal),
+        ["IsActive"] = typeof(bool),
+        ["Created"] = typeof(DateTime),
+        ["UniqueId"] = typeof(Guid)
+    }
+};
+
+using var reader = new CsvDataReader("data.csv", options);
+while (reader.Read())
+{
+    int id = reader.GetInt32(0);           // Already converted from string
+    decimal price = reader.GetDecimal(1);  // Culture-aware parsing
+    bool active = reader.GetBoolean(2);    // Handles true/false/yes/no/1/0
+    DateTime created = reader.GetDateTime(3);
+    Guid guid = reader.GetGuid(4);
+}
+```
+
+**Built-in type converters:** `Guid`, `bool`, `DateTime`, `short`, `int`, `long`, `float`, `double`, `decimal`, `byte`, `string`
+
+**Combine with schema inference:**
+
+```csharp
+// Infer types from CSV data, then use them for reading
+var columns = CsvSchemaInference.InferSchemaFromSample("data.csv");
+var typeMap = CsvSchemaInference.ToColumnTypes(columns);
+
+var options = new CsvReaderOptions { ColumnTypes = typeMap };
+using var reader = new CsvDataReader("data.csv", options);
+```
+
+**Custom type converters:**
+
+```csharp
+using Dataplat.Dbatools.Csv.TypeConverters;
+
+// Create a custom converter for enums or custom types
+public class StatusConverter : TypeConverterBase<OrderStatus>
+{
+    public override bool TryConvert(string value, out OrderStatus result)
+    {
+        return Enum.TryParse(value, true, out result);
+    }
+}
+
+// Register and use
+var registry = TypeConverterRegistry.Default;
+registry.Register(new StatusConverter());
+
+var options = new CsvReaderOptions
+{
+    TypeConverterRegistry = registry,
+    ColumnTypes = new Dictionary<string, Type> { ["Status"] = typeof(OrderStatus) }
+};
+```
+
 ### Null vs Empty String Handling
 
 CSV files can represent missing data in two ways: an empty field (`,,`) or an explicitly quoted empty string (`,"",...`). The `DistinguishEmptyFromNull` option controls how these are interpreted.
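
A note on the `StatusConverter` example in the README addition: `Enum.TryParse` also succeeds on numeric strings, so an input such as `"42"` parses even when no enum member has that value. The following is a minimal sketch of a stricter parse that uses only the .NET base class library. The `OrderStatus` members and the helper name `TryParseDefined` are hypothetical (the commit references `OrderStatus` without defining it), and whether a converter should reject such input is a design choice the commit does not state.

```csharp
using System;

// Hypothetical enum; the commit references OrderStatus but does not define it.
public enum OrderStatus { Pending, Shipped, Cancelled }

public static class EnumParsing
{
    // Parses case-insensitively, then rejects results that do not
    // correspond to a declared member (for example the string "42").
    // Numeric strings that do match a member, such as "1", are still accepted.
    public static bool TryParseDefined<TEnum>(string value, out TEnum result)
        where TEnum : struct, Enum
    {
        return Enum.TryParse(value, ignoreCase: true, out result)
            && Enum.IsDefined(typeof(TEnum), result);
    }
}
```

A converter's `TryConvert` body could delegate to a helper like this in place of the bare `Enum.TryParse` call, assuming out-of-range values should be treated as conversion failures.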

project/dbatools/Csv/Reader/CsvSchemaInference.cs

Lines changed: 2 additions & 2 deletions
@@ -375,7 +375,7 @@ private static Action<long, long> WrapProgressCallback(Action<double> userCallba
         /// <param name="tableName">The name of the table to create.</param>
         /// <param name="schemaName">Optional schema name (default: dbo).</param>
         /// <returns>A CREATE TABLE SQL statement.</returns>
-        public static string GenerateCreateTableStatement(IEnumerable<InferredColumn> columns, string tableName, string schemaName = "dbo")
+        public static string GenerateCreateTableStatement(List<InferredColumn> columns, string tableName, string schemaName = "dbo")
         {
             if (columns == null)
                 throw new ArgumentNullException(nameof(columns));
@@ -409,7 +409,7 @@ public static string GenerateCreateTableStatement(IEnumerable<InferredColumn> co
         /// </summary>
         /// <param name="columns">The inferred column definitions.</param>
         /// <returns>A dictionary mapping column names to .NET types.</returns>
-        public static Dictionary<string, Type> ToColumnTypes(IEnumerable<InferredColumn> columns)
+        public static Dictionary<string, Type> ToColumnTypes(List<InferredColumn> columns)
         {
             if (columns == null)
                 throw new ArgumentNullException(nameof(columns));
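
Narrowing these parameters from `IEnumerable<InferredColumn>` to `List<InferredColumn>` is a source-breaking change for any caller that passes an array or a lazy LINQ query: such callers now have to materialize the sequence first. The sketch below illustrates the caller-side adjustment under stated assumptions. `InferredColumn` here is a hypothetical two-property stand-in (the property names `Name` and `ClrType` are invented for illustration), and `ToColumnTypes` mirrors only the new parameter type from the diff, not the real implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in; the real InferredColumn likely has other members and names.
public sealed class InferredColumn
{
    public string Name { get; set; }
    public Type ClrType { get; set; }
}

public static class SchemaSketch
{
    // Stand-in that copies only the new signature shape from the diff.
    public static Dictionary<string, Type> ToColumnTypes(List<InferredColumn> columns)
    {
        if (columns == null)
            throw new ArgumentNullException(nameof(columns));
        return columns.ToDictionary(c => c.Name, c => c.ClrType);
    }

    public static Dictionary<string, Type> Demo()
    {
        var inferred = new[]
        {
            new InferredColumn { Name = "Id", ClrType = typeof(int) },
            new InferredColumn { Name = "Notes", ClrType = typeof(string) },
        };

        // A LINQ query is an IEnumerable<InferredColumn>; passing it directly
        // to a List<InferredColumn> parameter does not compile.
        IEnumerable<InferredColumn> typedOnly =
            inferred.Where(c => c.ClrType != typeof(string));

        // Materialize the query to satisfy the narrowed parameter type.
        return ToColumnTypes(typedOnly.ToList());
    }
}
```

Callers that already hold a `List<InferredColumn>` are unaffected; only those passing arrays, queries, or other sequence types need the extra `.ToList()` call.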
