Reorganise query execution functions. (#580)
Reorganise query execution functions.

Puts all the various ways to execute an SQL command into 3 categories:
1. "exec" functions return `result` (generally).
2. "query" functions wrap "exec" ones, plus string conversion.
3. "stream" functions are like "query" ones, but stream the query.

To make these things fall into place, I'm renaming the recently added
`for_each()` to `for_stream()`, and providing a `for_query()` cousin.

Eventually, I hope `pqxx::result` can just _disappear_ from most
users' consciousness.  The normal ways to execute a query will be...
* _exec0_ just for queries that return no data,
* _query_ functions for small result sets or exotic queries, and
* _stream_ functions for regular queries returning large result sets.

(As a separate effort, I would like to integrate use of parameterised
statements into the regular execution functions, so you just pass some
`pqxx::params` to those basic functions.  Un-parameterised statements
will be nothing but a hidden optimisation.)
jtv authored Jul 6, 2022
1 parent 05a45d2 commit 38cf12a
Showing 7 changed files with 319 additions and 114 deletions.
7 changes: 6 additions & 1 deletion NEWS
@@ -1,6 +1,11 @@
7.7.4
- New ways to query a single row! `query01()` and `query1()`.
- `transaction_base::for_each()` is now called `for_stream()`. (#580)
- New `transaction_base::for_query()` is similar, but non-streaming. (#580)
- Query data and iterate directly as client-side types: `query()`. (#580)
- New ways to query a single row! `query01()` and `query1()`. (#580)
- We now have 3 kinds of execution: "exec", "query", and "stream" functions.
- Use C++23 `std::unreachable()` where available.
- `result::iter()` return value now keeps its `result` alive.
7.7.3
- Fix up more damage done by auto-formatting.
- New `result::for_each()`: simple iteration and conversion of rows. (#528)
73 changes: 42 additions & 31 deletions README.md
@@ -92,12 +92,16 @@ in standard C++ style (as in `<iostream>` etc.), but an editor will still
recognize them as files containing C++ code.

Continuing the list of classes, you may also need the result class
(`pqxx/result.hxx`). In a nutshell, you create a `connection` based on a
Postgres connection string (see below), create a `work` in the context of that
connection, and run one or more queries on the work which return `result`
objects. The results are containers of rows of data, each of which you can
treat as an array of strings: one for each field in the row. But there are
other ways to query the database.
(`pqxx/result.hxx`). In a nutshell, you create a pqxx::connection based on a
Postgres connection string (see below), create a pqxx::work (a transaction
object) in the context of that connection, and run one or more queries and/or
SQL commands on that.

Depending on how you execute a query, it can return a stream of `std::tuple`
(each representing one row); or a pqxx::result object which holds both the
result data and additional metadata: how many rows your query returned and/or
modified, what the column names are, and so on. A pqxx::result is a container
of pqxx::row, and a pqxx::row is a container of pqxx::field.

Here's an example with all the basics to get you going:

@@ -111,52 +115,50 @@ Here's an example with all the basics to get you going:
{
// Connect to the database. You can have multiple connections open
// at the same time, even to the same database.
pqxx::connection C;
std::cout << "Connected to " << C.dbname() << '\n';
pqxx::connection c;
std::cout << "Connected to " << c.dbname() << '\n';

// Start a transaction. A connection can only have one transaction
// open at the same time, but after you finish a transaction, you
// can start a new one on the same connection.
pqxx::work W{C};

// Perform a query and retrieve all results.
pqxx::result R{W.exec("SELECT name FROM employee")};
pqxx::work tx{c};

// Iterate over results.
std::cout << "Found " << R.size() << " employees:\n";
for (auto row: R)
std::cout << row[0].c_str() << '\n';
// Query data of two columns, converting them to std::string and
// int respectively. Iterate the rows.
for (auto [name, salary] : tx.query<std::string, int>(
"SELECT name, salary FROM employee ORDER BY name"))
{
std::cout << name << " earns " << salary << ".\n";
}

// For large amounts of data, "streaming" the results is more
// efficient. It does not work for all types of queries though.
// What's really nice is that you don't need to iterate result
// objects. This just converts the fields straight to the C++
// types you need.
//
// You can use std::string_view for fields here, which is not
// You can read fields as std::string_view here, which is not
// something you can do in most places. A string_view becomes
// meaningless when the underlying string ceases to exist. In this
// one situation, you can convert a field to string_view and it
// will be valid for just that one iteration of the loop. The next
// iteration may overwrite or deallocate its buffer space.
for (auto [name, salary] : W.stream<std::string_view, int>(
for (auto [name, salary] : tx.stream<std::string_view, int>(
"SELECT name, salary FROM employee"))
{
std::cout << name << " earns " << salary << ".\n";
}

// Execute a statement (and check that it returns 0 rows of data).
// Execute a statement, and check that it returns 0 rows of data.
// This will throw pqxx::unexpected_rows if the query returns rows.
std::cout << "Doubling all employees' salaries...\n";
W.exec0("UPDATE employee SET salary = salary*2");
tx.exec0("UPDATE employee SET salary = salary*2");

// Easy way to query a value from the database.
int my_salary = W.query_value<int>(
// Shorthand: conveniently query a single value from the database.
int my_salary = tx.query_value<int>(
"SELECT salary FROM employee WHERE name = 'Me'");
std::cout << "I now earn " << my_salary << ".\n";

// Or, query one whole row. This will throw an exception unless
// the result contains exactly 1 row.
auto [top_name, top_salary] = W.query1<std::string, int>(
// Or, query one whole row. This function will throw an exception
// unless the result contains exactly 1 row.
auto [top_name, top_salary] = tx.query1<std::string, int>(
R"(
SELECT name, salary
FROM employee
@@ -166,14 +168,23 @@ Here's an example with all the basics to get you going:
std::cout << "Top earner is " << top_name << " with a salary of "
<< top_salary << ".\n";

// Commit the transaction.
// If you need to access the result metadata, not just the actual
// field values, use the "exec" functions. Most of them return
// pqxx::result objects.
pqxx::result res = tx.exec("SELECT * FROM employee");
std::cout << "Columns:\n";
for (pqxx::row_size col = 0; col < res.columns(); ++col)
std::cout << res.column_name(col) << '\n';

// Commit the transaction. If you don't do this, the database will
// undo any changes you made in the transaction.
std::cout << "Making changes definite: ";
W.commit();
tx.commit();
std::cout << "OK.\n";
}
catch (std::exception const &e)
{
std::cerr << e.what() << '\n';
std::cerr << "ERROR: " << e.what() << '\n';
return 1;
}
return 0;
151 changes: 95 additions & 56 deletions include/pqxx/doc/accessing-results.md
@@ -1,30 +1,111 @@
Accessing results and result rows {#accessing-results}
=================================

When you execute a query using one of the transaction "exec*" functions, you
normally get a `result` object back. A `result` is a container of `row`s.
A query produces a result set consisting of rows, and each row consists of
fields. There are several ways to receive this data.

(There are exceptions: `exec1` expects exactly one row of data, so it returns
just a `row`, not a full `result`. And `exec0` expects no data at all, so it
returns nothing.)
The fields are "untyped." That is to say, libpqxx has no opinion on what their
types are. The database sends the data in a very flexible textual format.
When you read a field, you specify what type you want it to be, and libpqxx
converts the text format to that type for you.

Result objects are an all-or-nothing affair. An `exec*` function waits until
it's received all the result data, and only then will it return. _(There is a
faster, easier way of executing queries with large result sets, so see
"streaming rows" below as well.)_
If a value does not conform to the format for the type you specify, the
conversion fails. For example, if you have strings that all happen to contain
numbers, you can read them as `int`. But if any of the values is empty, or
it's null (for a type that doesn't support null), or it's some string that does
not look like an integer, or it's too large, you can't convert it to `int`.

For example, your code might do:
So usually, reading result data from the database means not just retrieving the
data; it also means converting it to some target type.
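
The failure cases above can be illustrated with a minimal sketch of strict text-to-`int` conversion, built on the standard `std::from_chars`. The helper name `parse_int_field` is hypothetical and is not part of libpqxx; libpqxx's own conversion code may work differently and reports failures in its own way. The sketch also does not model SQL nulls, only the textual cases.

```cxx
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not a libpqxx function): convert a field's text to
// int, or return an empty optional if the text is not exactly one integer
// that fits in an int.
std::optional<int> parse_int_field(std::string_view text)
{
  int value{};
  char const *const first = text.data();
  char const *const last = first + text.size();
  auto const [ptr, ec] = std::from_chars(first, last, value);
  // Fail on empty or non-numeric text, on overflow, and on trailing garbage.
  if (ec != std::errc{} || ptr != last)
    return std::nullopt;
  return value;
}
```

With this helper, `"42"` and `"-7"` convert, while an empty string, `"12abc"`, and a number too large for `int` all produce an empty optional.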


Querying rows of data
---------------------

The simplest way to query rows of data is to call one of a transaction's
"query" functions, passing as template arguments the types of columns you want
to get back (e.g. `int`, `std::string`, `double`, and so on) and as a regular
argument the query itself.

You can then iterate over the result to go over the rows of data:

```cxx
pqxx::result r = tx.exec("SELECT * FROM mytable");
for (auto [id, value] :
tx.query<int, std::string>("SELECT id, name FROM item"))
{
std::cout << id << '\t' << value << '\n';
}
```
Now, how do you access the data inside `r`?
The "query" functions execute your query, load the complete result data from
the database, and then as you iterate, convert each row it received to a tuple
of C++ types that you indicated.
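
To give a feel for what "convert each row to a tuple" means, here is a self-contained sketch that turns one row of text fields into a `std::tuple` of requested types. It is an illustration only: `convert_field` and `convert_row` are hypothetical names, a row is modelled as a plain `std::vector<std::string>`, and libpqxx's real conversion machinery is richer than this.

```cxx
#include <charconv>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <system_error>
#include <tuple>
#include <vector>

// Hypothetical conversion hook: one specialisation per supported type.
template<typename T> T convert_field(std::string_view text);

template<> inline int convert_field<int>(std::string_view text)
{
  int value{};
  char const *const last = text.data() + text.size();
  auto const [ptr, ec] = std::from_chars(text.data(), last, value);
  if (ec != std::errc{} || ptr != last)
    throw std::runtime_error{"Field is not a valid int."};
  return value;
}

template<> inline std::string convert_field<std::string>(std::string_view text)
{
  return std::string{text};
}

// Convert a row of text fields to a tuple of the requested types.
template<typename... Ts>
std::tuple<Ts...> convert_row(std::vector<std::string> const &row)
{
  if (row.size() != sizeof...(Ts))
    throw std::runtime_error{"Wrong number of columns."};
  std::size_t i = 0;
  // Braced initialisation evaluates its elements left to right, so each
  // type in Ts is paired with the field at the matching position.
  return std::tuple<Ts...>{convert_field<Ts>(row[i++])...};
}
```

A caller can then unpack a row with structured bindings, for example `auto [id, name] = convert_row<int, std::string>(row);`, which is the same shape as the loop variables in the example above.
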
There are different query functions for querying any number of rows (`query()`);
querying just one row of data as a `std::tuple` and throwing an error unless
there is exactly one row (`query1()`); or querying at most one row (`query01()`).
Streaming rows
--------------
Result sets act as standard C++ containers of rows. Rows act as standard
C++ containers of fields. So the easiest way to go through them is:
There's another way to go through the rows coming out of a query. It's
usually easier and faster if there are a lot of rows, but there are drawbacks.
**One,** you start getting rows before all the data has come in from the
database. That speeds things up, but what happens if you lose your network
connection while transferring the data? Your application may already have
processed some of the data before finding out that the rest isn't coming. If
that is a problem for your application, streaming may not be the right choice.
**Two,** streaming only works for some types of query. The `stream()` function
wraps your query in a PostgreSQL `COPY` command, and `COPY` only supports a few
commands: `SELECT`, `VALUES`, or an `INSERT`, `UPDATE`, or `DELETE` with a
`RETURNING` clause. See the `COPY` documentation here:
[
https://www.postgresql.org/docs/current/sql-copy.html
](https://www.postgresql.org/docs/current/sql-copy.html).
**Three,** when you convert a field to a "view" type (such as
`std::string_view` or `std::basic_string_view<std::byte>`), the view points to
underlying data which only stays valid until you iterate to the next row or
exit the loop. So if you want to use that data for longer than a single
iteration of the streaming loop, you'll have to store it somewhere yourself.
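
The view-lifetime rule can be sketched without a database. The following uses a stand-in class, `line_source`, which is hypothetical and not a libpqxx type: it hands out a `std::string_view` into one internal buffer that it overwrites for every row, which is the assumption being modelled here. The safe pattern is to copy the view into a `std::string` before moving on to the next row.

```cxx
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Stand-in for a row stream that reuses a single buffer (not libpqxx).
class line_source
{
public:
  explicit line_source(std::vector<std::string> rows) :
    m_rows{std::move(rows)}
  {}

  // Fetch the next row. Returns false at the end. The view written to
  // `out` is only valid until the next call to next().
  bool next(std::string_view &out)
  {
    if (m_pos >= m_rows.size())
      return false;
    m_buffer = m_rows[m_pos++]; // Overwrites what the previous view saw.
    out = m_buffer;
    return true;
  }

private:
  std::vector<std::string> m_rows;
  std::size_t m_pos = 0;
  std::string m_buffer;
};

// Keep every name beyond its own iteration by copying it to a string.
std::vector<std::string> collect_names(line_source &src)
{
  std::vector<std::string> kept;
  std::string_view name;
  while (src.next(name)) kept.emplace_back(name); // Copy, don't keep the view.
  return kept;
}
```

Storing the `std::string_view` values themselves in the vector would leave every element pointing at a buffer whose contents have since changed.
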
Now for the good news. Streaming does make it very easy to query data and loop
over it:
```cxx
for (auto [id, name, x, y] :
tx.stream<int, std::string_view, float, float>(
"SELECT id, name, x, y FROM point"))
process(id + 1, "point-" + std::string{name}, x * 10.0, y * 10.0);
```

The conversion to C++ types (here `int`, `std::string_view`, and two `float`s)
is built into the function. You never even see `row` objects, `field` objects,
iterators, or conversion methods. You just put in your query and you receive
your data.



Results with metadata
---------------------

Sometimes you want more from a query result than just rows of data. You may
need to know right away how many rows of result data you received, or how many
rows your `UPDATE` statement has affected, or the names of the columns, etc.

For that, use the transaction's "exec" query execution functions. Apart from a
few exceptions, these return a `pqxx::result` object. A `result` is a container
of `pqxx::row` objects, so you can iterate them as normal, or index them like
you would index an array. Each `row` in turn is a container of `pqxx::field`.
Each `field` holds a value, but doesn't know its type. You specify the type
when you read the value.

For example, your code might do:

```cxx
pqxx::result r = tx.exec("SELECT * FROM mytable");
for (auto const &row: r)
{
for (auto const &field: row) std::cout << field.c_str() << '\t';
@@ -116,45 +197,3 @@ This becomes really helpful with the array-indexing operator. With regular
C++ iterators you would need ugly expressions like `(*row)[0]` or
`row->operator[](0)`. With the iterator types defined by the result and
row classes you can simply say `row[0]`.


Streaming rows
--------------

There's another way to go through the rows coming out of a query. It's
usually easier and faster, but there are drawbacks.

**One,** you start getting rows before all the data has come in from the
database. That speeds things up, but what happens if you lose your network
connection while transferring the data? Your application may already have
processed some of the data before finding out that the rest isn't coming. If
that is a problem for your application, streaming may not be the right choice.

**Two,** streaming only works for some types of query. The `stream()` function
wraps your query in a PostgreSQL `COPY` command, and `COPY` only supports a few
commands: `SELECT`, `VALUES`, or an `INSERT`, `UPDATE`, or `DELETE` with a
`RETURNING` clause. See the `COPY` documentation here:
[
https://www.postgresql.org/docs/current/sql-copy.html
](https://www.postgresql.org/docs/current/sql-copy.html).

**Three,** when you convert a field to a "view" type (such as
`std::string_view` or `std::basic_string_view<std::byte>`), the view points to
underlying data which only stays valid until you iterate to the next row or
exit the loop. So if you want to use that data for longer than a single
iteration of the streaming loop, you'll have to store it somewhere yourself.

Now for the good news. Streaming does make it very easy to query data and loop
over it:

```cxx
for (auto [id, name, x, y] :
tx.stream<int, std::string_view, float, float>(
"SELECT id, name, x, y FROM point"))
process(id + 1, "point-" + std::string{name}, x * 10.0, y * 10.0);
```
The conversion to C++ types (here `int`, `std::string_view`, and two `float`s)
is built into the function. You never even see `row` objects, `field` objects,
iterators, or conversion methods. You just put in your query and you receive
your data.
6 changes: 3 additions & 3 deletions include/pqxx/doc/streams.md
@@ -55,9 +55,9 @@ then you begin processing. With `stream_from` you can be processing data on
the client side while the server is still sending you the rest.

You don't actually need to create a `stream_from` object yourself, though you
can. Two shorthand functions, @ref pqxx::transaction_base::stream
and @ref pqxx::transaction_base::for_each, can create the streams for you with
a minimum of overhead.
can if you want to. Two shorthand functions,
@ref pqxx::transaction_base::stream and @ref pqxx::transaction_base::for_stream,
can each create the streams for you with a minimum of overhead.

Not all kinds of queries will work in a stream. Internally the streams make
use of PostgreSQL's `COPY` command, so see the PostgreSQL documentation for
2 changes: 1 addition & 1 deletion include/pqxx/internal/result_iter.hxx
@@ -91,7 +91,7 @@ public:
iterator end() const { return {}; }

private:
pqxx::result const &m_home;
pqxx::result const m_home;
};
} // namespace pqxx::internal
