Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Table size is not shown #83

Open
dpxcc opened this issue Jan 16, 2025 · 7 comments
Open

Table size is not shown #83

dpxcc opened this issue Jan 16, 2025 · 7 comments
Labels
Milestone

Comments

@dpxcc
Copy link
Contributor

dpxcc commented Jan 16, 2025

What feature are you requesting?

Table size is not shown in \d+

postgres (pid: 97139) =# CREATE TABLE c (a int) USING columnstore;
CREATE TABLE
postgres (pid: 97139) =# INSERT INTO c VALUES (1), (2), (3);
INSERT 0 3
postgres (pid: 97139) =# CREATE TABLE r (a int);
CREATE TABLE
postgres (pid: 97139) =# INSERT INTO r VALUES (1), (2), (3);
INSERT 0 3
postgres (pid: 97139) =# \d;
invalid command \d;
Try \? for help.
postgres (pid: 97139) =# \d+
                                     List of relations
 Schema | Name | Type  |  Owner   | Persistence | Access method |    Size    | Description 
--------+------+-------+----------+-------------+---------------+------------+-------------
 public | c    | table | postgres | permanent   | columnstore   | 0 bytes    | 
 public | r    | table | postgres | permanent   | heap          | 8192 bytes | 
(2 rows)
@dpxcc dpxcc added feature good first issue Good for newcomers labels Jan 16, 2025
@dentiny
Copy link
Contributor

dentiny commented Jan 17, 2025

Hi may I ask a question on this ticket? I took a quick look and found there're two functions getting table size.
One is calculate_relation_size: https://github.com/postgres/postgres/blob/43830ecb8a9b6a1bc322298a77a5e0d87c2e83d4/src/backend/utils/adt/dbsize.c#L319-L361
Another is table_relation_size: https://github.com/postgres/postgres/blob/43830ecb8a9b6a1bc322298a77a5e0d87c2e83d4/src/include/access/tableam.h#L1863-L1876

After attach gdb to the process, I found user relation (which is created by columnstore and TAM defined) goes to calculate_relation_size, which checks local filesystem;
while system tables (i.e. pg_class, pg_description, etc) routes to table_relation_size, which in turn calls function defined by TAM.

So the task for this ticket is somehow to make \d+ invokes table_relation_size, am I understanding correctly? Thank you!

@dpxcc
Copy link
Contributor Author

dpxcc commented Jan 18, 2025

That's a great question!
Initially, I thought the task was simply to change columnstore_relation_size() to return the total size of all Parquet files. However, as you pointed out, \d+ doesn’t actually call columnstore_relation_size() at all
I don’t have an immediate answer and will need to examine Postgres code more closely. However, I did notice that citus has somehow made this work, using pg_total_relation_size() (https://github.com/citusdata/citus/tree/main/src/backend/columnar#compression-ratio). So, at the very least, we can study their implementation

@dentiny
Copy link
Contributor

dentiny commented Jan 18, 2025

I checked functions for object management: https://www.postgresql.org/docs/9.4/functions-admin.html#FUNCTIONS-ADMIN-DBSIZE, seems all functions are called calculate_relation_size, which gets the file size in local filesystem rather than mooncake TAM.

Since we throw exception via elog for our current columnstore handle, so it's easy to verify by simply telling whether our SQL is returning value or meeting error.

However, one command does leverage our TAM, that's ANALYZE, which gets system stats and stores in system catalog pg_statistic.
Reading the source code, relation size is fetched via TAM and used to calculate number of pages: https://github.com/postgres/postgres/blob/d3d0983169130a9b81e3fe48d5c2ca4931480956/src/backend/commands/analyze.c#L194; but it's not stored in the pg_statistic.

In short, we don't have a good way to verify the correctness for calculate_relation_size, and it's not useful.

I made a dummy implementation for necessary functions for ANALYZE, which simply doesn't fail the command but doesn't provide actual value.

pg_duckdb seems to do the same thing (https://github.com/duckdb/pg_duckdb/blob/main/src/pgduckdb_table_am.cpp), maybe people's using ANALYZE command in their workload, and they don't want failure.

@dpxcc
Copy link
Contributor Author

dpxcc commented Jan 20, 2025

calculate_relation_size() computes the total file sizes of <relationpath>, <relationpath>.1, <relationpath>.2 etc
Can we place our Parquet files in those expected locations so that calculate_relation_size() can work as intended?

@dentiny
Copy link
Contributor

dentiny commented Jan 21, 2025

Can we place our Parquet files in those expected locations so that calculate_relation_size() can work as intended?

Yes we can, two concerns here:

  • It only works for local filesystem; for remote storage it still doesn't work unless we mount (which usually depends on users)
  • Delta log / parquet data files structure would be more flat (i.e. all files under the same database are all placed under directory base/<db_iod>/<relation_oid>). Might be a concern for future list operations (but could be alleviated by batch write and compaction).

But it could have other benefits, it allows mooncake to stick more closely to postgres, there could be other features depending on data file locations.

@dpxcc
Copy link
Contributor Author

dpxcc commented Jan 21, 2025

Delta log / parquet data files structure would be more flat (i.e. all files under the same database are all placed under directory base/<db_iod>/<relation_oid>). Might be a concern for future list operations (but could be alleviated by batch write and compaction).

We can just put symbolic links there, the actual Parquet files can be arranged in whatever way we want
stat() will follow symbolic link and output the actual Parquet file size

@dpxcc
Copy link
Contributor Author

dpxcc commented Jan 21, 2025

I noticed that there is something called sparse file: https://en.wikipedia.org/wiki/Sparse_file
stat() will output whatever size we want to display, but internally they don't take any space

If we really want to make this work, we can just put dummy sparse file in the expected location for each Parquet file
(This works whether Parquet file is in local filesystem or remote storage)

It's a little bit hacky so I'm inclined to avoid doing this for now

@dpxcc dpxcc removed the good first issue Good for newcomers label Jan 21, 2025
@dpxcc dpxcc added this to the tbd milestone Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants