SQLCell

Bringing together the power of SQL, Python, and JavaScript. Run raw, multi-threaded SQL in an IPython Notebook while concurrently executing Python cells, plus a ton of other features.

See the tutorials here:

Part 1: The Basics

Part 2: Beyond The Basics

[screenshot: SQLCell preview]

Contents


Most recent feature
Installation and Configuration
Features
 i. Parameters
 ii. Flags
 iii. Pass python variables to SQL
 iv. psql metacommands
 v. Multi-threading
 vi. Buttons
 vii. Inline editing
 viii. Easy-to-read Query Plan Table
 ix. Easy-to-read Query Plan Graph
 x. Alter column name and type via the UI
 xi. Notifications
 xii. pg_dump support
 xiii. Switch Engines (To be documented...)  
In development
 i. Built-in PostGIS preview (inspired by postgis-preview)
 ii. MSSQL support.  
To dos
 i. Add UI elements to perform basic, common database tasks, such as adding columns, creating tables, etc.
 ii. Need to confirm install process is smooth on non-Mac computers.
 iii. Add support for MySQL commands.
 iv. Add modifiers and constraints via the UI.

Open issues can be found here.

Most recent feature

added 12/04/2016

SQLCell now offers the option to view the Query Plan as a Sankey graph built with D3.js.

[screenshot: query plan Sankey graph]

Installation and configuration

Just clone the repo and cp the sqlcell_app.py file to Jupyter's startup directory (on my computer, the directory is ~/.ipython/profile_default/startup, but may be different depending on your OS and version of IPython/Jupyter):

$ cd .ipython/profile_default/startup # or wherever your startup directory is
$ git clone https://github.com/tmthyjames/SQLCell.git
$ cp SQLCell/sqlcell_app.py sqlcell_app.py # place sqlcell_app.py in the startup folder so it will get executed

Then, in the engine_config.py file, define your connection variables. If you don't add them to engine_config.py, you'll have to pass a connection string to the ENGINE parameter every time you use %%sql, like so:

In [2]: %%sql ENGINE='postgresql://username:password@host:port/database'
        SELECT * FROM table;

To save the engines:

%%sql --declare_engines new
LOCAL=postgresql://username:password@localhost:5432/
DEV=postgresql://username:password@random.domain.com/

See more about this option in the Declare Engines section

Now you are ready to ditch pgAdmin or whatever SQL interface you use. Continue reading to see all the available options, like writing results to a CSV, using SQLAlchemy named parameters, and more.

Features

Parameters

Available parameters:
DB: Determines which database to query. This parameter is required on the first run; after that, SQLCell remembers which database was chosen. To change databases, use this parameter again. Default: the last-specified database.
PATH: Writes the results of the query to a CSV (can also be done through the UI). No default.
MAKE_GLOBAL: Assigns the results of the query to the variable you pass to it. If this parameter is specified but RAW is not, the results will be a pandas DataFrame. If RAW is set to True, the results will be the raw RowProxy returned from the database. No default.
RAW: Determines whether the data will be of type DataFrame or RowProxy. Default: False.
DISPLAY: Determines whether to render the results as a table. Best used in conjunction with the MAKE_GLOBAL parameter, since rendering a table in a busy workflow can be cumbersome.
ENGINE: Specifies which host and database to connect to. Default: the connection specified in the engine_config.py file. If engine_config.py is not configured, the ENGINE parameter is required.
TRANSACTION_BLOCK: Determines whether the query is executed inside a transaction block. Set this to False when creating a database, dropping a database, running VACUUM ANALYZE, or executing any other statement that cannot run inside a transaction block. Default: True.
EDIT: Enables inline editing. To use this, your query must reference exactly one table, and that table must have a primary key. Default: False.
NOTIFY: Determines whether notifications are shown for finished queries. Default: True.

Examples of how to use these are below.

DB Parameter

After adding your connection details to engine_config.py, run your first query with the DB argument:

In [3]: %%sql DB=bls
        SELECT * 
        FROM la_unemployment
        LIMIT 3

|   | series_id            | year | period | value | footnote_codes |
|---|----------------------|------|--------|-------|----------------|
| 1 | LASST470000000000003 | 1976 | M01    | 6.2   | None           |
| 2 | LASST470000000000003 | 1976 | M02    | 6.1   | None           |
| 3 | LASST470000000000003 | 1976 | M03    | 6.0   | None           |

For the rest of the session, you won't have to use the DB argument unless you want to change databases. And the last-used DB will be persisted even after you shut down Jupyter and start it back up next time.

In [4]: %%sql
        SELECT * 
        FROM avg_price LIMIT 3

|   | series_id     | year | period | value |
|---|---------------|------|--------|-------|
| 1 | APU0000701111 | 1980 | M01    | 0.203 |
| 2 | APU0000701111 | 1980 | M02    | 0.205 |
| 3 | APU0000701111 | 1980 | M03    | 0.211 |

To switch databases, just invoke the DB argument again with a different database:

In [5]: %%sql DB=sports
        SELECT * 
        FROM nba LIMIT 3

|   | dateof     | team | opp | pts | fg | fg_att | ft | ft_att | fg3 | fg3_att | off_rebounds | def_rebounds | asst | blks | fouls | stls | turnovers |
|---|------------|------|-----|-----|----|--------|----|--------|-----|---------|--------------|--------------|------|------|-------|------|-----------|
| 1 | 2015-10-27 | DET  | ATL | 106 | 37 | 96     | 20 | 26     | 12  | 29      | 23           | 36           | 23   | 3    | 15    | 5    | 15        |
| 2 | 2015-10-27 | ATL  | DET | 94  | 37 | 82     | 12 | 15     | 8   | 27      | 7            | 33           | 22   | 4    | 25    | 9    | 15        |
| 3 | 2015-10-27 | CLE  | CHI | 95  | 38 | 94     | 10 | 17     | 9   | 29      | 11           | 39           | 26   | 7    | 21    | 5    | 11        |

PATH Parameter

To write the data to a CSV, use the PATH argument:

In [6]: %%sql PATH='/<path>/<to>/<file>.csv'
        SELECT * 
        FROM nba LIMIT 3

|   | dateof     | team | opp | pts | fg | fg_att | ft | ft_att | fg3 | fg3_att | off_rebounds | def_rebounds | asst | blks | fouls | stls | turnovers |
|---|------------|------|-----|-----|----|--------|----|--------|-----|---------|--------------|--------------|------|------|-------|------|-----------|
| 1 | 2015-10-27 | DET  | ATL | 106 | 37 | 96     | 20 | 26     | 12  | 29      | 23           | 36           | 23   | 3    | 15    | 5    | 15        |
| 2 | 2015-10-27 | ATL  | DET | 94  | 37 | 82     | 12 | 15     | 8   | 27      | 7            | 33           | 22   | 4    | 25    | 9    | 15        |
| 3 | 2015-10-27 | CLE  | CHI | 95  | 38 | 94     | 10 | 17     | 9   | 29      | 11           | 39           | 26   | 7    | 21    | 5    | 11        |
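
Since the output is a plain CSV on disk, you can read it straight back into pandas (using the same placeholder path):

In [7]: import pandas as pd
        pd.read_csv('/<path>/<to>/<file>.csv').head()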

MAKE_GLOBAL Parameter

And my favorite: you can assign the DataFrame to a variable using the MAKE_GLOBAL argument:

In [9]: %%sql MAKE_GLOBAL=WHATEVER_NAME_YOU_WANT DB=bls
        SELECT * 
        FROM la_unemployment
        WHERE year = 1976
            AND period = 'M01'
        LIMIT 3

|   | series_id            | year | period | value     | footnote_codes |
|---|----------------------|------|--------|-----------|----------------|
| 1 | LASST470000000000003 | 1976 | M01    | 6.2       | None           |
| 2 | LASST470000000000004 | 1976 | M01    | 111152.0  | None           |
| 3 | LASST470000000000005 | 1976 | M01    | 1691780.0 | None           |

And call the variable:

In [10]: WHATEVER_NAME_YOU_WANT

|   | series_id            | year | period | value     | footnote_codes |
|---|----------------------|------|--------|-----------|----------------|
| 1 | LASST470000000000003 | 1976 | M01    | 6.2       | None           |
| 2 | LASST470000000000004 | 1976 | M01    | 111152.0  | None           |
| 3 | LASST470000000000005 | 1976 | M01    | 1691780.0 | None           |
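
Because the variable is an ordinary pandas DataFrame (when RAW isn't set), the usual pandas API applies to it; for example:

In [11]: WHATEVER_NAME_YOU_WANT['value'].mean()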

RAW Parameter

You can also return the raw RowProxy from SQLAlchemy by setting the RAW argument to True and using the MAKE_GLOBAL argument.

In [10]: %%sql MAKE_GLOBAL=data RAW=True
         SELECT * 
         FROM la_unemployment
         LIMIT 3

In [11]: data
         [(u'LASST470000000000003', 1976, u'M01', 6.2, None),
          (u'LASST470000000000003', 1976, u'M02', 6.1, None),
          (u'LASST470000000000003', 1976, u'M03', 6.0, None)]
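
The raw rows behave like tuples, so you can still build a DataFrame from them yourself if you need one later. A minimal sketch, with the column names taken from the tables above:

In [12]: import pandas as pd
         df = pd.DataFrame(data, columns=['series_id', 'year', 'period',
                                          'value', 'footnote_codes'])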

DISPLAY Parameter

Query the data without rendering the table (useful if the result set is prohibitively large and displaying the table breaks things) by setting the DISPLAY parameter to False. It makes sense to use this parameter in conjunction with the MAKE_GLOBAL parameter so the data is passed to the variable but the table isn't rendered:

In [10]: %%sql MAKE_GLOBAL=data DISPLAY=False
         SELECT * 
         FROM la_unemployment
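
No table is rendered, but the variable is populated as usual:

In [11]: data.shape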

ENGINE Parameter

The ENGINE parameter accepts any connection string and creates a connection based on that.

In [10]: %%sql ENGINE='postgresql://username:password@host:port/DB'
         SELECT * 
         FROM la_unemployment
         LIMIT 3

TRANSACTION_BLOCK Parameter

Some SQL statements (VACUUM, CREATE <db>, DROP <db>, etc.) must be executed outside of a transaction block, which is done by setting the connection's isolation level to 0.

In [10]: %%sql TRANSACTION_BLOCK=False
         VACUUM ANALYZE <table_name>
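
For reference, this amounts to running the statement in autocommit mode. A rough equivalent in plain SQLAlchemy (a sketch, not SQLCell's actual code; the connection string is assumed):

In [11]: from sqlalchemy import create_engine, text
         engine = create_engine('postgresql://username:password@host:port/DB')  # assumed connection string
         with engine.connect() as conn:
             # AUTOCOMMIT lifts the implicit transaction block that
             # VACUUM and CREATE/DROP DATABASE cannot run inside
             conn.execution_options(isolation_level='AUTOCOMMIT') \
                 .execute(text('VACUUM ANALYZE la_unemployment'))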

EDIT Parameter

Enables inline editing.

In [10]: %%sql EDIT=True
         SELECT * 
         FROM la_unemployment
         LIMIT 3

Will display a table where the cells can be clicked on and edited.

NOTIFY Parameter

In [11]: %%sql NOTIFY=False
         SELECT * FROM la_unemployment LIMIT 1

Will disable notifications for the remainder of your Jupyter session. To re-enable notifications, just set NOTIFY=True.

Flags


declare_engines: Makes it easy to add engines to the engine_config.py file.
pg_dump: Run pg_dump commands from your Jupyter Notebook.

Declare Engines

All engines should be in this format: name=connection_string. For example:

%%sql --declare_engines new
LOCAL=postgresql://username:password@localhost:5432/
DEV=postgresql://username:password@random.domain.com/

Will create the ENGINES object with only LOCAL and DEV in it. "LOCAL" will be the text that goes on the button, like the following:

[screenshot: declare_engines new]

To append new engines to an existing ENGINES object:

%%sql --declare_engines append
LOCAL_test=postgresql://username:password@localhost:5432/
DEV_test=postgresql://username:password@random.domain.com/

[screenshot: declare_engines append]

Where I previously had the engines LOCAL, DEV, and PROD, I now also have LOCAL_test and DEV_test.

pg_dump support

In[17]: %%sql --pg_dump
        -t nba sports --schema-only

Will output the following:

SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;
SET row_security = off;
SET search_path = public, pg_catalog;
SET default_tablespace = '';
SET default_with_oids = false;
--
-- Name: nba; Type: TABLE; Schema: public; Owner: postgres
--
CREATE TABLE nba (
    dateof date,
    team character varying(5),
    opp character varying(5),
    pts bigint,
    fg integer,
    fg_att integer,
    ft integer,
    ft_att integer,
    fg3 integer,
    fg3_att integer,
    off_rebounds integer,
    def_rebounds integer,
    asst integer,
    blks integer,
    fouls integer,
    stls integer,
    turnovers integer
);
ALTER TABLE nba OWNER TO postgres;

Pass Python variables to SQL

To pass Python variables to your queries, just do the following.

In [7]: # define your parameters in a python cell
        year = '1976'
        period = 'M01'
        series_id = ('LASST470000000000005', 'LASST470000000000004', 'LASST470000000000003')

Now in a %%sql cell:

In [8]: %%sql DB=bls
        SELECT * 
        FROM la_unemployment
        WHERE year = %(year)s
            AND period = %(period)s AND series_id IN %(series_id)s
        LIMIT 3

You can also use a colon to indicate your variables:

In [8]: %%sql DB=bls
        SELECT * 
        FROM la_unemployment
        WHERE year = :year
            AND period = :period AND series_id IN :series_id
        LIMIT 3

Both output the following table:

|   | series_id            | year | period | value     | footnote_codes |
|---|----------------------|------|--------|-----------|----------------|
| 1 | LASST470000000000003 | 1976 | M01    | 6.2       | None           |
| 2 | LASST470000000000004 | 1976 | M01    | 111152.0  | None           |
| 3 | LASST470000000000005 | 1976 | M01    | 1691780.0 | None           |
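
These are standard pyformat and SQLAlchemy named parameters under the hood. If you ever need the same query outside SQLCell, a rough psycopg2 equivalent looks like this (a sketch, assuming the bls connection details from earlier; psycopg2 adapts the Python tuple for the IN clause):

In [9]: import psycopg2
        conn = psycopg2.connect('postgresql://username:password@host:port/bls')  # assumed connection string
        with conn, conn.cursor() as cur:
            cur.execute("""
                SELECT * FROM la_unemployment
                WHERE year = %(year)s AND period = %(period)s
                    AND series_id IN %(series_id)s
                LIMIT 3
            """, {'year': year, 'period': period, 'series_id': series_id})
            rows = cur.fetchall()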

psql metacommands

In [1]: %%sql DB=bls
        \dp

|   | Schema | Name            | Type  | Access privileges | Column privileges | Policies |
|---|--------|-----------------|-------|-------------------|-------------------|----------|
| 1 | public | avg_price       | table | nan               | nan               | nan      |
| 2 | public | la_unemployment | table | nan               | nan               | nan      |
| 3 | public | tu_atus         | table | nan               | nan               | nan      |

In [2]: %%sql
        \d avg_price

|   | Column    | Type                  | Modifiers |
|---|-----------|-----------------------|-----------|
| 1 | series_id | character varying(17) | nan       |
| 2 | year      | integer               | nan       |
| 3 | period    | character varying(3)  | nan       |
| 4 | value     | real                  | nan       |

In [3]: %%sql DB=sports
        \COPY public.nba (dateof, team, opp, pts, fouls) to '/<path>/<to>/<file>.csv'
	
Out[3]: COPY 3092

Multi-threading

All queries are executed on their own thread, so you can run as many queries as your box will allow while concurrently executing Python code.
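
To illustrate the general pattern (a sketch only, not SQLCell's actual internals):

In [12]: import threading
         from sqlalchemy import create_engine, text
         engine = create_engine('postgresql://username:password@host:port/bls')  # assumed connection string
         results = {}
         def run_query():
             # the query runs off the main thread
             with engine.connect() as conn:
                 results['rows'] = conn.execute(
                     text('SELECT * FROM la_unemployment LIMIT 3')).fetchall()
         t = threading.Thread(target=run_query)
         t.start()
         # ...the notebook keeps executing Python while the query runs...
         t.join()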

Buttons

[screenshot: button toolbar]

Buttons include:
• Viewing the Query Plan as a D3.js Sankey graph
• Running EXPLAIN ANALYZE on your query
• Executing the query
• Executing the query and returning the SQLAlchemy results in a variable
• Saving the results to a TSV
• Stopping the query
• Switching between user-defined engines (button group on the right; see Declare Engines for instructions on how to define engines)

Inline editing

Set the EDIT parameter to True to enable inline editing. As long as you are querying a single table and that table has a primary key, you can edit it through the UI.

[screenshot: inline editing]

Easy-to-read Query Plan table

This includes a heatmap-like color scale to indicate problem spots in your query.

[screenshot: query plan table]

Easy-to-read Query Plan graph

A Sankey graph, built with D3.js, that uses a heatmap-like color scale to indicate problem spots in your query.

[screenshot: query plan Sankey graph]

Alter Column Name and Type Inline

To edit the column info via the UI, use the \d <table-name> metacommand and the EDIT parameter.

In[15]: %%sql DB=sports EDIT=True
        \d nba

[screenshot: alter column UI]

Notifications

SQLCell now includes "Growl"-like, Bootstrap-styled notifications using mouse0270's awesome bootstrap-notify. The entire query is shown in a scrollable pre tag, and clicking the notification focuses the window on that query's results.

[screenshot: notifications]

And that's it.

Enjoy and contribute.
