This standalone tool aims to facilitate data manipulations that are often unintuitive for average Excel users and sometimes troublesome even for more advanced ones. Examples include table joins and text aggregation, operations that basic Excel functions do not yet fully address.
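To make the two example operations concrete, here is a minimal sketch of a table join and a text aggregation written directly in pandas. The sample data and column names are hypothetical, and this is not taken from the tool's own code; it only illustrates what the operations do.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "item": ["pen", "ink", "paper", "stapler"],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 4],
    "name": ["Ann", "Bob", "Dee"],
})

# Table join: keep every order and attach the customer name where one exists.
# Orders without a matching customer get a missing value in "name".
joined = orders.merge(customers, on="customer_id", how="left")

# Text aggregation: collapse the items of each customer into a single
# comma-separated string, one row per customer.
items_per_customer = (
    orders.groupby("customer_id")["item"]
    .agg(", ".join)
    .reset_index()
)

print(joined)
print(items_per_customer)
```

A left join is used here so that no row of the first dataset is dropped; other join types only change the `how` argument.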
The tool works with two datasets at a time. The user copies data from a spreadsheet into the tool via the clipboard, with no manual data import step. It can also read other tabular data, such as a (non-image) table shown on a webpage.
The tool is simple to operate. Every function is readily accessible from a streamlined user interface. To start an operation, copy the source data (including the column headers), then press Ctrl + V on either the left or the right dataset container in the main window. After the data appears in the dataset container, press one of the square buttons to run a specific operation.
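Spreadsheet applications typically place copied cells on the clipboard as tab-separated text, one line per row. The sketch below shows how such text could be split into a header and data rows using only the standard library. The function name `parse_clipboard_text` and the sample string are hypothetical, and the tool's actual parsing may differ.

```python
import csv
import io


def parse_clipboard_text(text):
    """Split tab-separated clipboard text into a header row and data rows."""
    # newline="" leaves line endings untouched so csv can handle \n and \r\n.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    # Skip blank lines, which csv.reader yields as empty or all-blank rows.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    if not rows:
        raise ValueError("clipboard text contains no data")
    return rows[0], rows[1:]


# Hypothetical clipboard content: a header line followed by two data rows.
sample = "id\tname\tamount\r\n01\tAnn\t2.55\r\n02\tBob\t0.901\r\n"
header, data = parse_clipboard_text(sample)
print(header)
print(data)
```

Every cell is kept as a string here, so values such as `01` survive unchanged until a later step decides how to interpret them.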
This tool offers the following capabilities:
- It is recommended to remove all formatting before copying from an Excel spreadsheet, because the clipboard always copies the displayed values and disregards the underlying values. This behavior can lead to an incorrect data type or a loss of precision, which may produce unintended results if not carefully inspected. For instance:
  Underlying Value | Displayed Value |
  ---|---|
  2.55 | 2.6 |
  0.901 | 90% |
  31-Jan-21 | 31-Jan |
  232,000,000,000 | 2.32e+11 |
- When pasting the output into Excel, leading zeros could be lost (e.g. "01" becomes "1") if the cells in the worksheet are not formatted as "Text". As a workaround, select the entire column in Excel, choose "Format Cells" from the menu and then click "Text".
- To avoid system instability caused by low memory, a limit on data size (maximum number of rows and columns) is imposed to prevent excessively large data from being read into memory. Users can adjust the limits by pressing the 'setting' button in the main window. Note that raising the limits beyond the machine's capacity could significantly degrade performance (the impact varies between machines).
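The formatting pitfalls described in the notes above can be reproduced in a few lines of plain Python. This is only an illustration using some of the displayed values listed above; the helper `parse_displayed` is hypothetical and is not part of the tool.

```python
def parse_displayed(text):
    """Convert a displayed cell value to a float, as a naive reader might."""
    text = text.strip().replace(",", "")
    if text.endswith("%"):
        return float(text[:-1]) / 100
    return float(text)


# (underlying value, displayed value) pairs from the notes above.
samples = [(2.55, "2.6"), (0.901, "90%"), (232_000_000_000, "2.32e+11")]

for underlying, displayed in samples:
    recovered = parse_displayed(displayed)
    # The first two values no longer match what the spreadsheet stores;
    # the third keeps its value but turns from an integer into a float.
    print(f"{underlying!r:>16} -> {displayed!r:>10} -> {recovered!r}")

# Leading zeros disappear as soon as a text code is interpreted as a number.
code = "01"
as_number = int(code)
print(code, "->", as_number)
```

Once the displayed text has been copied, the original digits cannot be recovered, which is why removing the formatting before copying is the safer route.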
This tool is designed to run with minimal dependencies. Nevertheless, the following must be pre-installed:
- Python version 3.8
- Pandas library
- NumPy library
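One way to confirm these requirements before launching is a short check like the one below. It is a sketch only: treating 3.8 as a minimum version rather than an exact one is an assumption, and the function name `missing_requirements` is hypothetical.

```python
import importlib.util
import sys

# Assumption: Python 3.8 is treated as the minimum supported version.
REQUIRED_PYTHON = (3, 8)
REQUIRED_MODULES = ("pandas", "numpy")


def missing_requirements():
    """Return a list of human-readable problems; empty if all is in place."""
    problems = []
    if sys.version_info < REQUIRED_PYTHON:
        found = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python 3.8 or later is required, found {found}")
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(name) is None:
            problems.append(f"module not installed: {name}")
    return problems


if __name__ == "__main__":
    for problem in missing_requirements():
        print(problem)
```

The check only looks the modules up without importing them, so it runs quickly and has no side effects.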
Folder/File | Purpose |
---|---|
res\ | image library folder |
main.py | the main entry point of this tool |
mod*.py | modules consisting of common classes and functions |
operation.py | the specific module for all runnable operations |
setting.py | the configuration file |