VBA-CSV

Fast and convenient CSV reading and writing for VBA (Excel, Word, PowerPoint) and Excel spreadsheets, inspired by R's fread, and Julia's CSV.jl.

If you like this code please give it a star. I like stars.

Installation

Download modCSVReadWrite.bas from the latest release.
Import modCSVReadWrite.bas into your project (Open VBA Editor, Alt + F11; File > Import File).
Add three references (In VBA Editor Tools > References)
- Microsoft Scripting Runtime
- Microsoft VBScript Regular Expressions 5.5
- Microsoft ActiveX Data Objects 6.1 Library
If you plan to call the functions from worksheet formulas then you might like to tell Excel's Function Wizard about them by adding calls to RegisterCSVRead and RegisterCSVWrite to the project's Workbook_Open event, which lives in the ThisWorkbook class module.

Private Sub Workbook_Open()
    RegisterCSVRead
    RegisterCSVWrite
End Sub

Alternatively (or in addition), install Excel-DNA Intellisense and copy the worksheet _Intellisense_ from VBA-CSV-Intellisense.xlsx (in the latest release) into the same workbook into which you copied modCSVReadWrite.

Acknowledgements

I re-worked the parsing code of CSVRead after examining sdkn104's code available here; my approach is now similar to the one employed there, and the performance is similar too.

The documentation borrows freely from that of Julia's CSV.jl, though sadly VBA is not capable of Julia's extremely high performance. For testing CSVRead, I also use many of the test files that the authors of CSV.jl have created here.

Examples

Read from URL

Rdatasets provides 1,752 csv files including one containing names of passengers on the Titanic, their sex, age and passenger class, and whether or not they survived the sinking. In this file, missing data is indicated by the two characters NA.

To see the data in a worksheet, enter this formula¹: =CSVRead("https://vincentarelbundock.github.io/Rdatasets/csv/carData/TitanicSurvival.csv",TRUE,,,,,,,,,,,,,"NA",NA())

To load the data into an array in VBA:

Sub Demo()
    Dim TitanicData
    TitanicData = CSVRead(FileName:="https://vincentarelbundock.github.io" & _
                  "/Rdatasets/csv/carData/TitanicSurvival.csv", _
                  ConvertTypes:=True, MissingStrings:="NA")
End Sub

Write and Read "round-tripping"

In the GIF below, strings, numbers, dates and Booleans in the input Data range are written to file with the formula =CSVWrite(A2:D8,"C:\Temp\Test.csv") at cell F2, and the file is read back with the formula =CSVRead(F2,TRUE) at cell J2, so that the original types of the fields are recovered. The formula =CSVRead(F2,,FALSE) at H2 demonstrates how CSVRead can display the "raw" contents of a text file by passing Delimiter as FALSE. Refresh your browser (F5) to replay the GIF.

Footnote 1: Assumes you're using Excel 365 or Excel 2021 with dynamic array formulas. If you're using an older version of Excel, this page explains how to enter a formula that returns an array.

Documentation

CSVRead

Returns the contents of a comma-separated file on disk as an array.

Public Function CSVRead(ByVal FileName As String, Optional ByVal ConvertTypes As Variant = False, _
          Optional ByVal Delimiter As Variant, Optional ByVal IgnoreRepeated As Boolean, _
          Optional ByVal DateFormat As String = "Y-M-D", Optional ByVal Comment As String, _
          Optional ByVal IgnoreEmptyLines As Boolean, Optional ByVal HeaderRowNum As Long, _
          Optional ByVal SkipToRow As Long, Optional ByVal SkipToCol As Variant = 1, _
          Optional ByVal NumRows As Long, Optional ByVal NumCols As Variant = 0, _
          Optional ByVal TrueStrings As Variant, Optional ByVal FalseStrings As Variant, _
          Optional ByVal MissingStrings As Variant, Optional ByVal ShowMissingsAs As Variant, _
          Optional ByVal Encoding As Variant, Optional ByVal DecimalSeparator As String, _
          Optional ByRef HeaderRow As Variant) As Variant

Argument	Description
`FileName`	The full name of the file, including the path, or else a URL of a file, or else a string in CSV format.
`ConvertTypes`	Controls whether fields in the file are converted to typed values or remain as strings, and sets the treatment of "quoted fields" and space characters. `ConvertTypes` should be a string of zero or more letters from allowed characters `NDBETQK`. The most commonly useful letters are: 1) `N` number fields are returned as numbers (Doubles). 2) `D` date fields (that respect `DateFormat`) are returned as Dates. 3) `B` fields matching `TrueStrings` or `FalseStrings` are returned as Booleans. `ConvertTypes` is optional and defaults to the null string for no type conversion. `TRUE` is equivalent to `NDB` and `FALSE` to the null string. Four further options are available: 4) `E` fields that match Excel errors are converted to error values. There are fourteen of these, including `#N/A`, `#NAME?`, `#VALUE!` and `#DIV/0!`. 5) `T` leading and trailing spaces are trimmed from fields. In the case of quoted fields, this will not remove spaces between the quotes. 6) `Q` conversion happens for both quoted and unquoted fields; otherwise only unquoted fields are converted. 7) `K` quoted fields are returned with their quotes kept in place. For most files, correct type conversion can be achieved with `ConvertTypes` as a string which applies for all columns, but type conversion can also be specified on a per-column basis. Enter an array (or range) with two columns or two rows, column numbers on the left/top and type conversion (subset of `NDBETQK`) on the right/bottom. Instead of column numbers, you can enter strings matching the contents of the header row, and a column number of zero applies to all columns not otherwise referenced. For convenience when calling from VBA, you can pass an array of two element arrays such as `Array(Array(0,"N"),Array(3,""),Array("Phone",""))` to convert all numbers in a file into numbers in the return except for those in column 3 and in the column(s) headed "Phone".
`Delimiter`	By default, `CSVRead` will try to detect a file's delimiter as the first instance of comma, tab, semi-colon, colon or pipe found in the first 10,000 characters of the file, searching only outside of quoted regions and outside of date-with-time fields (since these contain colons). If it can't auto-detect the delimiter, it will assume comma. If your file includes a different character or string delimiter you should pass that as the `Delimiter` argument. Alternatively, enter `FALSE` as the delimiter to treat the file as "not a delimited file". In this case the return will mimic how the file would appear in a text editor such as NotePad. The file will be split into lines at all line breaks (irrespective of double quotes) and each element of the return will be a line of the file.
`IgnoreRepeated`	Whether delimiters which appear at the start of a line, the end of a line or immediately after another delimiter should be ignored while parsing; useful for fixed-width files with delimiter padding between fields.
`DateFormat`	The format of dates in the file such as `Y-M-D` (the default), `M-D-Y` or `Y/M/D`. Also `ISO` for ISO8601 (e.g., 2021-08-26T09:11:30) or `ISOZ` (time zone given e.g. 2021-08-26T13:11:30+05:00), in which case dates-with-time are returned in UTC time.
`Comment`	Rows that start with this string will be skipped while parsing.
`IgnoreEmptyLines`	Whether empty rows/lines in the file should be skipped while parsing (if `FALSE`, each column will be assigned `ShowMissingsAs` for that empty row).
`HeaderRowNum`	The row in the file containing headers. Type conversion is not applied to fields in the header row, though leading and trailing spaces are trimmed. This argument is most useful when calling from VBA, with `SkipToRow` set to one more than `HeaderRowNum`. In that case the function returns the rows starting from `SkipToRow`, and the header row is returned via the by-reference argument `HeaderRow`. Optional and defaults to 0.
`SkipToRow`	The first row in the file that's included in the return. Optional and defaults to one more than `HeaderRowNum`.
`SkipToCol`	The column in the file at which reading starts, as a number or a string matching one of the file's headers. Optional and defaults to 1 to read from the first column.
`NumRows`	The number of rows to read from the file. If omitted (or zero), all rows from `SkipToRow` to the end of the file are read.
`NumCols`	If a number, sets the number of columns to read from the file. If a string matching one of the file's headers, sets the last column to be read. If omitted (or zero), all columns from `SkipToCol` are read.
`TrueStrings`	Indicates how `TRUE` values are represented in the file. May be a string, an array of strings or a range containing strings; by default, `TRUE`, `True` and `true` are recognised.
`FalseStrings`	Indicates how `FALSE` values are represented in the file. May be a string, an array of strings or a range containing strings; by default, `FALSE`, `False` and `false` are recognised.
`MissingStrings`	Indicates how missing values are represented in the file. May be a string, an array of strings or a range containing strings. By default, only an empty field (consecutive delimiters) is considered missing.
`ShowMissingsAs`	Fields which are missing in the file (consecutive delimiters) or match one of the `MissingStrings` are returned in the array as `ShowMissingsAs`. Defaults to Empty, but the null string or `#N/A!` error value can be good alternatives. If `NumRows` is greater than the number of rows in the file then the return is "padded" with the value of `ShowMissingsAs`. Likewise, if `NumCols` is greater than the number of columns in the file.
`Encoding`	Allowed entries are `ASCII`, `ANSI`, `UTF-8`, or `UTF-16`. For most files this argument can be omitted and `CSVRead` will detect the file's encoding. If auto-detection does not work, then it's possible that the file is encoded `UTF-8` or `UTF-16` but without a byte order mark to identify the encoding. Experiment with `Encoding` as each of `UTF-8` and `UTF-16`. `ANSI` is taken to mean `Windows-1252` encoding.
`DecimalSeparator`	In many places in the world, floating point number decimals are separated with a comma instead of a period (3,14 vs. 3.14). `CSVRead` can correctly parse these numbers by passing in the `DecimalSeparator` as a comma, in which case comma ceases to be a candidate if the parser needs to guess the `Delimiter`.
`HeaderRow`	This by-reference argument is for use from VBA (as opposed to from Excel formulas). It is populated with the contents of the header row, with no type conversion, though leading and trailing spaces are removed.

source

CSVWrite

Creates a comma-separated file on disk containing Data. Any existing file of the same name is overwritten. If successful, the function returns FileName, otherwise an "error string" (starts with #, ends with !) describing what went wrong.

Public Function CSVWrite(ByVal Data As Variant, Optional ByVal FileName As String, _
          Optional ByVal QuoteAllStrings As Variant = True, Optional ByVal DateFormat As String = "YYYY-MM-DD", _
          Optional ByVal DateTimeFormat As String = "ISO", Optional ByVal Delimiter As String = ",", _
          Optional ByVal Encoding As String = "ANSI", Optional ByVal EOL As String = vbNullString, _
          Optional TrueString As String = "True", Optional FalseString As String = "False") As String

Argument	Description
`Data`	An array of data, or an Excel range. Elements may be strings, numbers, dates, Booleans, empty, Excel errors or null values. `Data` typically has two dimensions, but if `Data` has only one dimension then the output file has a single column, one field per row.
`FileName`	The full name of the file, including the path. Alternatively, if `FileName` is omitted, then the function returns `Data` converted CSV-style to a string.
`QuoteAllStrings`	If `TRUE` (the default) then elements of `Data` that are strings are quoted before being written to file, other elements (Numbers, Booleans, Errors) are not quoted. If `FALSE` then the only elements of `Data` that are quoted are strings containing `Delimiter`, line feed, carriage return or double quote. In both cases, double quotes are escaped by another double quote. If "Raw" then no strings are quoted. Use this option with care, the file written may not be in valid CSV format.
`DateFormat`	A format string that determines how dates, including cells formatted as dates, appear in the file. If omitted, defaults to `yyyy-mm-dd`.
`DateTimeFormat`	Format for datetimes. Defaults to `ISO` which abbreviates `yyyy-mm-ddThh:mm:ss`. Use `ISOZ` for ISO8601 format with time zone the same as the PC's clock. Use with care, daylight saving may be inconsistent across the datetimes in data.
`Delimiter`	The delimiter string, if omitted defaults to a comma. `Delimiter` may have more than one character.
`Encoding`	Allowed entries are `ANSI` (the default), `UTF-8`, `UTF-16`, `UTF-8NOBOM` and `UTF-16NOBOM`. An error will result if this argument is `ANSI` but `Data` contains characters with code point above 127.
`EOL`	Sets the file's line endings. Enter `Windows`, `Unix` or `Mac`. Also supports the line-ending characters themselves (ascii 13 + ascii 10, ascii 10, ascii 13) or the strings `CRLF`, `LF` or `CR`. The default is `Windows` if `FileName` is provided, or `Unix` if not. The last line of the file is written with a line ending.
`TrueString`	How the Boolean value True is to be represented in the file. Optional, defaulting to "True".
`FalseString`	How the Boolean value False is to be represented in the file. Optional, defaulting to "False".

source

Errors

You can call CSVRead and CSVWrite both from spreadsheets (best with Excel 365 and dynamic array formulas) and from VBA. When an error occurs, the functions return a string starting with # and ending with ! which gives an explanation of what went wrong.

So, to get robust error handling from VBA, you should wrap calls to CSVRead and CSVWrite inside calls to a function ThrowIfError:

FileContents = ThrowIfError(CSVRead("c:\path\filename.csv"))

source

An alternative approach is to change the constant m_ErrorStyle (at the top of module modCSVRead) from , es_ReturnString to es_RaiseError, but in that case calls from spreadsheet formulas will return #VALUE! if any error happens, with no description provided.

Testing

CSVRead is tested prior to release against a large collection of test files with comparisons carried out between expected and observed results. You can look at the test code here, or run it yourself if you download and unzip the source code from the latest version, open the workbook VBA-CSV.xlsm from the workbooks folder, and click the "Run Tests" button on the "Tests" worksheet. The tests cover almost 100% of the code in modCSVReadWrite.bas.

Notes

Line endings

CSVRead automatically detects a file's line endings, whether they be Windows (CRLF), Unix (LF) or (pre-OSX) Mac (CR). Files with mixed line-endings are handled by taking any instance of either CRLF, CR or LF (outside quoted regions) to indicate a line ending.

Fractional seconds in dates and times

During type conversion, CSVRead accepts dates and times where the number of seconds includes a decimal fraction. For example the time10:13:20.500 is half a second later than the time 10:13:20. This contrasts with the VBA function CDate where executing CDate("10:13:20.500") results in an error.

Two-digit years

If you use CSVRead to parse files that represent dates with only two digits for the year then you might wish to change the values of two constants in modCSVReadWrite that determine how such dates are interpreted:

'The two constants below set the range of years to which date strings with two-digit years are converted.
Private Const m_2DigitYearIsFrom As Long = 1930
Private Const m_2DigitYearIsTo As Long = 2029

Array lower bounds

The return from CSVRead is an array with lower bounds of one. If you prefer array lower bounds to be zero, then edit the constant m_LBound (at the top of modCSVReadWrite.bas) to be 0 rather than 1.
source

Excel limits on string length

There is a limit on the total number of characters that an Excel cell can contain of 32,767. Therefore, when CSVWrite is called from a worksheet formula with the FileName argument omitted, the function will return an error string if the return would otherwise be longer than 32,767 characters. Similarly CSVRead, when called from a worksheet formula, will return an error string if any individual field to be returned is longer than 32,767.

Compatibility

VBA-CSV works on Windows, but not on Mac, iOS or Android. The host application must be either Excel, Word or PowerPoint. It has been tested on modern (Microsoft 365) versions of these applications, both 64-bit and 32-bit, and on Excel 2013, 32 bit. It should work on all other Office versions (2007 or later) but has not been tested on them.

About

Author: Philip Swannell
Licence: MIT

Name		Name	Last commit message	Last commit date
Latest commit History 362 Commits
.vscode		.vscode
images		images
julia		julia
testfiles		testfiles
testresults		testresults
vba		vba
workbooks		workbooks
.gitattributes		.gitattributes
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VBA-CSV

Contents

Installation

Acknowledgements

Examples

Read from URL

Write and Read "round-tripping"

Documentation

CSVRead

CSVWrite

Errors

Testing

Notes

Line endings

Fractional seconds in dates and times

Two-digit years

Array lower bounds

Excel limits on string length

Compatibility

About

About

Releases 30

Languages

License

PGS62/VBA-CSV

Folders and files

Latest commit

History

Repository files navigation

VBA-CSV

Contents

Installation

Acknowledgements

Examples

Read from URL

Write and Read "round-tripping"

Documentation

CSVRead

CSVWrite

Errors

Testing

Notes

Line endings

Fractional seconds in dates and times

Two-digit years

Array lower bounds

Excel limits on string length

Compatibility

About

About

Topics

Resources

License

Stars

Watchers

Forks

Releases 30

Languages