sqlBatchInsertR provides an efficient way of inserting a data frame into a database table and handling SQL DML statements. It is built on top of the RODBC package and aims to extend its functionality.
The package has so far been tested with Microsoft SQL Server and is still in development.
In the summer of 2016 I volunteered at a non-profit community health clinic, not knowing it would be the start of a full-time career. My first big project was to compare monthly Excel records to records in a Microsoft SQL Server database. To accomplish the task, I used R and the RODBC package.
The next phase was to track the Excel records in more detail. This involved data modeling and designing an ETL process. One of the challenges at this point was the time it took to write a data frame to the database. RODBC offers sqlSave(), but it couldn't keep up with the demand. I wasn't the only one who noticed; others raised this issue on Stack Overflow, but no coded solution was proposed.
This is when I had the inspiration to write my own functions. I started writing the code in 2017 and eventually arrived at a solution: creating batches of parameterized INSERT statements. It was a major speedup, and I came to rely on it for other projects. The code lived in a few scripts until I started building it out as an internal package in 2018.
It's now 2021 and I've decided to open source the package in hopes it'll be useful to others. What's nice about it is that it's minimal. Only base R is used for data manipulation and RODBC for database connectivity.
- Install RODBC if necessary. In R, copy the following code into the console and press Enter.
  install.packages("RODBC")
- Download the source package (tar.gz) of the current release here.
- In R, copy the following code into the console.
  install.packages("sqlBatchInsertR_#.tar.gz", repos = NULL, type = "source")
- Change the first argument to be the path of the downloaded (tar.gz) file and press Enter.
The R package devtools allows users to install packages directly from GitHub. In R, copy the following code into the console and press Enter.
# install devtools if necessary
install.packages("devtools")
# install sqlBatchInsertR and dependent package from the GitHub repository
devtools::install_github("nicholas-alonzo/sqlBatchInsertR")
- Install Git if necessary.
- Open a terminal.
- Change the current working directory to the location where you want the cloned directory.
- Copy the following command into the terminal and press Enter. This will make a full copy of the repository data that GitHub has at that point in time, including all versions of every file and folder for the project.
  git clone https://github.com/nicholas-alonzo/sqlBatchInsertR
- Copy the following command into the terminal and press Enter. This will install the source package to your current version of R.
  R CMD INSTALL sqlBatchInsertR
Below are the main functions of the package.
sqlBatchInsert(dbconn, df, dbtable, nrecords = 1000)
- Inserts a data frame into a database table in a transaction using a list of parameterized INSERT statements.
sqlTransaction(dbconn, statement, prompt = TRUE, addl_text = NULL)
- Prompts the user to commit or roll back the results of a SQL INSERT, UPDATE, or DELETE statement.
sqlRowsAffected()
- Returns the number of rows affected in a database table when sqlBatchInsert() or sqlTransaction() executes successfully.
sqlOdbcErrors()
- Returns a list of ODBC errors after sqlBatchInsert() or sqlTransaction() fails to execute.
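To illustrate the batching idea behind sqlBatchInsert(), here is a minimal base R sketch. It is not the package's implementation: build_insert_batches() is a hypothetical helper, dbo.patients is a made-up table name, and the sketch inlines escaped literal values into multi-row INSERT statements rather than binding parameters. It only shows how a data frame can be split into groups of at most nrecords rows, with one statement per group.

```r
# Hypothetical helper (NOT part of sqlBatchInsertR): builds one multi-row
# INSERT statement per batch of at most `nrecords` data frame rows.
build_insert_batches <- function(df, dbtable, nrecords = 1000) {
  stopifnot(is.data.frame(df), nrow(df) > 0, nrecords >= 1)

  # Render one column as SQL literals: numbers as-is, everything else
  # single-quoted with embedded quotes doubled, and NA as NULL.
  to_literal <- function(x) {
    out <- if (is.numeric(x)) {
      as.character(x)
    } else {
      paste0("'", gsub("'", "''", as.character(x), fixed = TRUE), "'")
    }
    out[is.na(x)] <- "NULL"
    out
  }

  # One "(v1, v2, ...)" tuple per row, built column-wise.
  literals <- unname(lapply(df, to_literal))
  row_tuples <- paste0("(", do.call(paste, c(literals, sep = ", ")), ")")

  # Assign each row to a batch: rows 1..nrecords -> 1, the next group -> 2, ...
  batch_id <- ceiling(seq_len(nrow(df)) / nrecords)

  header <- sprintf("INSERT INTO %s (%s) VALUES ",
                    dbtable, paste(names(df), collapse = ", "))

  # Collapse the tuples of each batch into a single statement.
  unname(vapply(
    split(row_tuples, batch_id),
    function(rows) paste0(header, paste(rows, collapse = ", ")),
    character(1)
  ))
}

# Example with made-up data: three rows in batches of two gives two statements.
df <- data.frame(
  id = 1:3,
  name = c("Ann", "O'Neil", NA),
  stringsAsFactors = FALSE
)
stmts <- build_insert_batches(df, "dbo.patients", nrecords = 2)
print(stmts)
```

Each element of stmts could then be sent to the database inside a single transaction, for example with RODBC's sqlQuery(). That step is omitted here because it needs a live connection.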
Please refer to the Wiki for additional documentation including use cases, benchmarks, and limitations.
- 🦾 Share ideas for feature enhancements here.
- ‼️ Raise issues regarding functionality and share solutions here.
- 📓 Chat about anything and everything here.
- ✋ Ask for help and share solutions here.
- 💌 Share with me if this package is helpful here.
sqlBatchInsertR is released under the GNU General Public License v2.0.