Skip to content

Latest commit

 

History

History
113 lines (73 loc) · 5.14 KB

README.md

File metadata and controls

113 lines (73 loc) · 5.14 KB

sqlBatchInsertR

sqlBatchInsertR provides an efficient way of inserting a data frame into a database table and handling SQL DML statements. It is built on top of the RODBC package and aims to extend its functionality.

This has currently been tested for use with Microsoft SQL Server and is in development.

Inspiration and Motivation

In the summer of 2016 I volunteered at a non-profit community health clinic not knowing it would be the start to a full-time career. My first big project was to compare monthly Excel records to records in a Microsoft SQL Server database. To accomplish the task, I used R and the RODBC package.

The next phase was to track the Excel records in more detail. This involved data modeling and designing an ETL process. One of the challenges at this point was the data frame to database writing time. RODBC offers sqlSave(), but it couldn't keep up with the demand. I wasn't the only one that noticed; others raised this issue on Stack Overflow, but no coded solution was proposed.

This is when I had the inspiration to write my own functions. I started writing the code in 2017 until I reached a solution of creating batches of parameterized INSERT statements. It was a major speed up and I became reliant on it for other projects. The code lived in a few scripts until I started building it out as an internal package in 2018.

It's now 2021 and I've decided to open source the package in hopes it'll be useful to others. What's nice about it is that it's minimal. Only base R is used for data manipulation and RODBC for database connectivity.

Prerequisites

  • R (>= 4.0.0)
  • RODBC (>= 1.3.0)

Installation

  1. Install RODBC if necessary. In R, copy the following code into the console and press Enter.

    install.packages("RODBC")

In Release: Using Source Package

  1. Download the source package (tar.gz) of the current release here.

  2. In R, copy the following code into the console.

    install.packages("sqlBatchInsertR_#.tar.gz", repos = NULL, type = "source")
  3. Change the first argument to be the path of the downloaded (tar.gz) file and press Enter.

In Development: Using devtools

The R package devtools allows users to install packages directly from GitHub. In R, copy the following code into the console and press Enter.

# install devtools if necessary
install.packages("devtools")

# install sqlBatchInsertR and dependent package from the GitHub repository
devtools::install_github("nicholas-alonzo/sqlBatchInsertR")

In Development: Using Command Line

  1. Install Git if necessary.

  2. Open a terminal.

  3. Change the current working directory to the location where you want the cloned directory.

  4. Copy the following command into terminal and press Enter. This will make a full copy of the repository data that GitHub has at that point in time, including all versions of every file and folder for the project.

    git clone https://github.com/nicholas-alonzo/sqlBatchInsertR
    
  5. Copy the following command into terminal and press Enter. This will install the source package to your current version of R.

    R CMD INSTALL sqlBatchInsertR
    

Usage and Features

This has currently been tested for use with Microsoft SQL Server and is in development.

Below are the main functions of the package.

sqlBatchInsert(dbconn, df, dbtable, nrecords = 1000)
  • Inserts a data frame into a database table in a transaction using a list of parameterized INSERT statements.
sqlTransaction(dbconn, statement, prompt = TRUE, addl_text = NULL)
  • Prompts the user to commit or rollback the results of a SQL INSERT, UPDATE, or DELETE statement.
sqlRowsAffected()
  • Returns the number of rows affected in a database table when sqlBatchInsert() or sqlTransaction() are successfully executed.
sqlOdbcErrors()
  • Returns a list of ODBC errors after sqlBatchInsert() or sqlTransaction() fails to execute.

Documentation

Please refer to the Wiki for additional documentation including use cases, benchmarks, and limitations.

Contributing

  • 🦾 Share ideas for feature enhancements here.
  • ‼️ Raise issues regarding functionality and share solutions here.
  • 📓 Chat about anything and everything here.
  • ✋ Ask for help and share solutions here.
  • 💌 Share with me if this package is helpful here.

License

sqlBatchInsertR is released under the GNU General Public License v2.0.