Commit

1.1.0 -- json unescaping, csv output mode
micha committed Feb 21, 2016
1 parent 05eb966 commit 2ebf751
Showing 6 changed files with 125 additions and 5 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -1,6 +1,6 @@
.PHONY: all clean docs install dist

-VERSION = 1.0.1
+VERSION = 1.1.0
CFLAGS = -O3
LDFLAGS = -static
PREFIX = /usr/local
4 changes: 2 additions & 2 deletions README.md
@@ -34,13 +34,13 @@ elb-2 i-b910a256
Linux users can install prebuilt binaries from the release tarball:

```
-sudo bash -c "cd /usr/local && wget -O - https://github.com/micha/json-table/releases/download/1.0.1/jt-1.0.1.tar.gz | tar xzvf -"
+sudo bash -c "cd /usr/local && wget -O - https://github.com/micha/json-table/releases/download/1.1.0/jt-1.1.0.tar.gz | tar xzvf -"
```

Otherwise, to build from source:

```
-git checkout 1.0.1 && make && sudo make install
+git checkout 1.1.0 && make && sudo make install
```

## Documentation
37 changes: 37 additions & 0 deletions jt.1
@@ -115,6 +115,43 @@ If the item at the top of the data stack is not an object or if the object has n
.IP
If the \fIKEY\fR property of the object is an array subsequent commands will operate on one of the items in the array, chosen automatically by \fBjt\fR\. The array index will be available to subsequent commands via the index stack\.
.
.SH "JSON UNESCAPING AND CSV OUTPUT"
Strings in JSON data must not contain control characters (e\.g\., \fBtab\fR, \fBnewline\fR, etc\.) These characters \fImust\fR be escaped with a backslash\. Additionally, any character \fImay\fR be escaped with a backslash\. The JSON specification also allows escaping of unicode characters with \fB\eu\fR escape, for example the copyright symbol © can be encoded as \fB\eu00A9\fR, and the G\-clef character 𝄞 as \fB\euD834\euDD1E\fR\.
.
.P
Numbers may be expressed in a number of ways in JSON data, and there is a single \fBNumber\fR type that encompasses both integer and floating point\. Both decimal and exponential notation are valid in JSON\.
.
.SS "Strings"
\fBJt\fR does not unescape string values by default, in case they contain tab or newline characters that would break the tabular output format\. If unescaped values are needed this can be achieved by invoking \fBjt\fR with the \fB\-u\fR option in post processing\. For example:
.
.IP "" 4
.
.nf

$ jt \-u \'i love music \eu266A\'
i love music ♪
.
.fi
.
.IP "" 0
.
.SS "Numbers"
\fBJt\fR does not process numbers in any way \(em they are printed in the output verbatim, as they appear in the JSON input\. If special processing is required the \fBprintf\fR program in coreutils is your friend:
.
.IP "" 4
.
.nf

$ printf %\.0f 2\.99792458e9
2997924580
.
.fi
.
.IP "" 0
.
.SS "CSV Output"
The CSV format uses quoted values, which avoids the problems associated with values that contain tab and newline characters\. The \fB\-c\fR option puts \fBjt\fR into CSV output mode\. In this mode JSON strings are unescaped by default\. The \fBcsvtool\fR program and \fBcsvkit\fR suite of tools facilitate processing of CSV data in the shell\.
.
.SH "EXAMPLES"
We will use the following JSON input for the examples:
.
43 changes: 43 additions & 0 deletions jt.1.html

40 changes: 40 additions & 0 deletions jt.1.ronn
@@ -115,6 +115,46 @@ The following commands are available:
The array index will be available to subsequent commands via the index
stack.

## JSON UNESCAPING AND CSV OUTPUT

Strings in JSON data must not contain control characters (e.g., `tab`,
`newline`, etc.) These characters _must_ be escaped with a backslash.
Additionally, any character _may_ be escaped with a backslash. The JSON
specification also allows escaping of unicode characters with `\u` escape,
for example the copyright symbol © can be encoded as `\u00A9`, and the G-clef
character 𝄞 as `\uD834\uDD1E`.

Numbers may be expressed in a number of ways in JSON data, and there is a
single `Number` type that encompasses both integer and floating point. Both
decimal and exponential notation are valid in JSON.

### Strings

**Jt** does not unescape string values by default, in case they contain
tab or newline characters that would break the tabular output format. If
unescaped values are needed this can be achieved by invoking **jt** with the
`-u` option in post processing. For example:

$ jt -u 'i love music \u266A'
i love music ♪

### Numbers

**Jt** does not process numbers in any way — they are printed in the
output verbatim, as they appear in the JSON input. If special processing is
required the `printf` program in coreutils is your friend:

$ printf %.0f 2.99792458e9
2997924580

### CSV Output

The CSV format uses quoted values, which avoids the problems associated with
values that contain tab and newline characters. The `-c` option puts **jt**
into CSV output mode. In this mode JSON strings are unescaped by default. The
`csvtool` program and `csvkit` suite of tools facilitate processing of CSV
data in the shell.

## EXAMPLES

We will use the following JSON input for the examples:
4 changes: 2 additions & 2 deletions jt.c
@@ -1,7 +1,7 @@
#define _GNU_SOURCE
#define JSMN_STRICT
#define JSMN_PARENT_LINKS
-#define JT_VERSION "1.0.1"
+#define JT_VERSION "1.1.0"

#include <stdarg.h>
#include <stdio.h>
@@ -219,7 +219,7 @@ unsigned long utf_tag[4] = { 0x00, 0xc0, 0xe0, 0xf0 };

void encode_u_escaped(char **in, char **out) {
unsigned long p = read_code_point(in);
-int len = (p < 0x80) ? 1 : ((p < 0x800) ? 2 : ((p < 0x10000) ? 3 : 4));
+int len = (p < 0x80) ? 1 : (p < 0x800) ? 2 : (p < 0x10000) ? 3 : 4;
*out += len;
switch (len) {
case 4: *--(*out) = ((p | 0x80) & 0xbf); p >>= 6;
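
Note on the hunk above: the length selection in `encode_u_escaped` picks the number of UTF-8 bytes for a code point (1 byte below 0x80, 2 below 0x800, 3 below 0x10000, otherwise 4). The following is a minimal standalone sketch of that idea, together with the surrogate-pair decoding that the documentation's G-clef example (`\uD834\uDD1E`) relies on. It is not the code from jt.c: the names `sketch_read_code_point` and `sketch_utf8_encode` are hypothetical, and jt's own `read_code_point` is not visible in this diff and may handle errors differently.

```c
#include <stddef.h>

/* Parse exactly four hex digits at s. Returns the value, or -1 on bad input.
 * Stops at the first invalid character, so it never reads past a NUL. */
static long parse_hex4(const char *s) {
  long v = 0;
  for (int i = 0; i < 4; i++) {
    char c = s[i];
    int d;
    if (c >= '0' && c <= '9') d = c - '0';
    else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
    else return -1;
    v = (v << 4) | d;
  }
  return v;
}

/* Hypothetical helper (not jt's read_code_point): decode one \uXXXX escape,
 * or a high/low surrogate pair of two escapes, starting at *in.
 * On success advances *in past the consumed text and returns the code point.
 * On malformed input returns -1 and leaves *in unchanged. */
long sketch_read_code_point(const char **in) {
  const char *s = *in;
  if (s[0] != '\\' || s[1] != 'u') return -1;
  long cp = parse_hex4(s + 2);
  if (cp < 0) return -1;
  s += 6;
  if (cp >= 0xD800 && cp <= 0xDBFF) {
    /* High surrogate: a low surrogate escape must follow. */
    if (s[0] != '\\' || s[1] != 'u') return -1;
    long lo = parse_hex4(s + 2);
    if (lo < 0xDC00 || lo > 0xDFFF) return -1;
    s += 6;
    cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
  } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
    return -1; /* lone low surrogate */
  }
  *in = s;
  return cp;
}

/* Hypothetical helper: write code point p as UTF-8 into out, which must have
 * room for 4 bytes. Returns the number of bytes written (1 to 4). */
size_t sketch_utf8_encode(unsigned long p, unsigned char *out) {
  if (p < 0x80) {
    out[0] = (unsigned char)p;
    return 1;
  }
  if (p < 0x800) {
    out[0] = (unsigned char)(0xC0 | (p >> 6));
    out[1] = (unsigned char)(0x80 | (p & 0x3F));
    return 2;
  }
  if (p < 0x10000) {
    out[0] = (unsigned char)(0xE0 | (p >> 12));
    out[1] = (unsigned char)(0x80 | ((p >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (p & 0x3F));
    return 3;
  }
  out[0] = (unsigned char)(0xF0 | (p >> 18));
  out[1] = (unsigned char)(0x80 | ((p >> 12) & 0x3F));
  out[2] = (unsigned char)(0x80 | ((p >> 6) & 0x3F));
  out[3] = (unsigned char)(0x80 | (p & 0x3F));
  return 4;
}
```

Unlike the hunk above, which fills the output buffer backwards from the last byte, this sketch writes the bytes front to back. The lead bytes 0xC0, 0xE0 and 0xF0 match the non-zero entries of the `utf_tag` table shown in the hunk header.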

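Note on the CSV output mode documented in this commit: the part of jt.c that writes CSV is collapsed in this view, so the sketch below only illustrates why quoted values tolerate tabs and newlines. It assumes the common RFC 4180 convention (wrap the field in double quotes and double any embedded double quote), which may differ from what jt does. The function name `sketch_csv_quote` is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper (not from jt.c): quote src as one CSV field using the
 * RFC 4180 convention. Writes a NUL-terminated result into dst, which holds
 * cap bytes. Returns the length written (excluding the NUL), or (size_t)-1
 * if dst is too small. Tabs and newlines are copied through unchanged,
 * because the surrounding quotes keep them inside the field. */
size_t sketch_csv_quote(const char *src, char *dst, size_t cap) {
  size_t n = 0;
  if (cap < 3) return (size_t)-1; /* room for the two quotes and the NUL */
  dst[n++] = '"';
  for (; *src; src++) {
    size_t need = (*src == '"') ? 2 : 1;
    /* Always keep room for the closing quote and the NUL. */
    if (n + need + 2 > cap) return (size_t)-1;
    if (*src == '"') dst[n++] = '"'; /* double an embedded quote */
    dst[n++] = *src;
  }
  dst[n++] = '"';
  dst[n] = '\0';
  return n;
}
```

With this convention a value such as `a "b"` followed by a tab and `c` becomes a single field that starts and ends with a double quote, with the inner quotes doubled and the tab preserved.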