Commit 1693886

Clarify CSV library performance tradeoffs in README
Expanded README to explain why Dataplat's IDataReader implementation can't match the raw parsing speed of Sep/Sylvan due to architectural constraints. Added a comparison table highlighting scenarios where each library excels and clarified Dataplat's advantages for database import workflows.
1 parent ada074a commit 1693886

1 file changed, +15 -2 lines changed

project/Dataplat.Dbatools.Csv/README.md

Lines changed: 15 additions & 2 deletions
@@ -65,9 +65,22 @@ Benchmark: 100,000 rows × 10 columns (.NET 8, AVX-512)
 | CsvHelper | 101 ms | 1.4x slower |
 | LumenWorks | 100 ms | 1.4x slower |
 
-**Why choose Dataplat?** Sep and Sylvan are faster for raw parsing, but Dataplat provides the complete database workflow: native IDataReader for SqlBulkCopy, built-in compression (no need to extract `.csv.gz` files), progress reporting, and lenient parsing for messy enterprise exports.
+### Understanding the performance tradeoffs
 
-For `file.csv.gz → SqlBulkCopy → SQL Server` workflows, the complete Dataplat pipeline may outperform combining Sep + manual decompression + custom IDataReader wrapper.
+Sep achieves 21 GB/s by using `Span<T>` and only materializing strings when explicitly requested. Sylvan uses similar techniques. Both avoid allocations until the last possible moment.
+
+**Why Dataplat can't match this:** The `IDataReader` interface requires `GetValue()` to return actual `object` instances. For string columns, this means creating real `string` objects—we can't return spans. This is a fundamental architectural tradeoff for SqlBulkCopy compatibility.
+
+**When each library shines:**
+
+| Scenario | Bottleneck | Winner |
+|----------|-----------|--------|
+| CSV → SqlBulkCopy → SQL Server | Network/disk I/O, not parsing | Dataplat (integrated) |
+| CSV.gz → SQL Server | Decompression overhead | Dataplat (built-in) |
+| Messy enterprise exports | Error handling complexity | Dataplat (lenient mode) |
+| Raw in-memory parsing benchmark | CPU/allocations | Sep/Sylvan |
+
+For database import workflows, the complete `file.csv.gz → SqlBulkCopy → SQL Server` pipeline with Dataplat is often comparable to combining Sep + manual decompression + custom IDataReader wrapper, while requiring less code.
 
 ## Quick Start

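The tradeoff described in the added README text (span-based access versus `GetValue()` returning `object`) can be illustrated with a small sketch. This is not code from Dataplat, Sep, or Sylvan: `FieldAccessSketch`, `GetFieldSpan`, and this `GetValue` are hypothetical names, and the field splitting is a naive comma scan that ignores quoting. It only shows why an `object`-returning accessor has to allocate a string, since `ReadOnlySpan<char>` is a `ref struct` and cannot be boxed.

```csharp
using System;

// Hypothetical illustration only; not part of Dataplat, Sep, or Sylvan.
public static class FieldAccessSketch
{
    // Span-style access: returns a view over the existing characters.
    // No new string is allocated. Naive split: quoted fields are not handled.
    public static ReadOnlySpan<char> GetFieldSpan(string line, int index)
    {
        ReadOnlySpan<char> rest = line.AsSpan();
        for (int i = 0; i < index; i++)
        {
            int comma = rest.IndexOf(',');
            if (comma < 0) throw new ArgumentOutOfRangeException(nameof(index));
            rest = rest.Slice(comma + 1);
        }
        int end = rest.IndexOf(',');
        return end < 0 ? rest : rest.Slice(0, end);
    }

    // IDataReader-style access: the return type is object, and a span
    // cannot be boxed, so the field must be copied into a real string.
    public static object GetValue(string line, int index)
    {
        return GetFieldSpan(line, index).ToString();
    }
}
```

Calling `FieldAccessSketch.GetFieldSpan("42,Alice,Seattle", 1)` yields a view over the characters `Alice` inside the original string, while `FieldAccessSketch.GetValue("42,Alice,Seattle", 1)` allocates a new `string` for the same field. That per-field allocation is the cost the README attributes to SqlBulkCopy compatibility.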