2 changes: 1 addition & 1 deletion docs/en/introduction/Architecture.md
@@ -60,7 +60,7 @@ The shared-data architecture maintains as simple an architecture as its shared-n

#### Nodes

-Coordinator nodes in the shared-data architecture provide the same functions as FEs in the shared-nothing architecture.
+FEs in the shared-data architecture provide the same functions as in the shared-nothing architecture.

BEs are replaced with CNs (Compute Nodes), and the storage function is offloaded to object storage or HDFS. CNs are stateless compute nodes that perform all the functions of BEs, except for the storage of data.

79 changes: 79 additions & 0 deletions docs/en/introduction/what_is_starrocks.md
@@ -0,0 +1,79 @@
---
displayed_sidebar: docs
---

# What is StarRocks?

StarRocks is a next-generation, blazing-fast massively parallel processing (MPP) database designed to make real-time analytics easy for enterprises. It is built to power sub-second queries at scale.

StarRocks has an elegant design. It encompasses a rich set of features, including a fully vectorized engine, a newly designed cost-based optimizer (CBO), and intelligent materialized views. As such, StarRocks can deliver query speeds far exceeding those of comparable database products, especially for multi-table joins.

StarRocks is ideal for real-time analytics on fresh data. Data can be ingested at high speed, and updated and deleted in real time. StarRocks lets users create tables that use various schemas, such as flat, star, and snowflake schemas.

Compatible with the MySQL protocol and standard SQL, StarRocks has out-of-the-box support for all major Business Intelligence (BI) tools, such as Tableau and Power BI. StarRocks does not rely on any external components. It is an integrated data analytics platform that allows for high scalability, high availability, and simplified management and maintenance.

[StarRocks](https://github.com/StarRocks/starrocks/tree/main) is licensed under Apache 2.0, available at the StarRocks GitHub repository (see the [StarRocks license](https://github.com/StarRocks/starrocks/blob/main/LICENSE.txt)). StarRocks (i) links to or calls functions from third party software libraries, the licenses of which are available in the folder [licenses-binary](https://github.com/StarRocks/starrocks/tree/main/licenses-binary); and (ii) incorporates third party software code, the licenses of which are available in the folder [licenses](https://github.com/StarRocks/starrocks/tree/main/licenses).

## Scenarios

StarRocks meets varied enterprise analytics requirements, including OLAP (Online Analytical Processing) multi-dimensional analytics, real-time analytics, high-concurrency analytics, customized reporting, ad-hoc queries, and unified analytics.

### OLAP multi-dimensional analytics

The MPP framework and vectorized execution engine enable users to choose between various schemas to develop multi-dimensional analytical reports.

Scenarios:

- User behavior analysis

- User profiling, label analysis, user tagging

- High-dimensional metrics report

- Self-service dashboard

- Service anomaly probing and analysis

- Cross-theme analysis

- Financial data analysis

- System monitoring analysis

### Real-time analytics

StarRocks uses the Primary Key table to implement real-time updates. Data changes in a TP (Transaction Processing) database can be synchronized to StarRocks in a matter of seconds to build a real-time warehouse.

Scenarios:

- Online promotion analysis

- Logistics tracking and analysis

- Performance analysis and metrics computation for the financial industry

- Quality analysis for livestreaming

- Ad placement analysis

- Cockpit management

- Application Performance Management (APM)
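
The real-time update behavior of the Primary Key table described in this section can be sketched as follows. This is a minimal illustration, not a recommended schema: the table `orders` and its columns are hypothetical, and the statements assume a running cluster with default table properties that suit your deployment.

```SQL
-- Hypothetical Primary Key table; names and columns are illustrative only.
CREATE TABLE orders (
    order_id   BIGINT NOT NULL,
    status     VARCHAR(20),
    updated_at DATETIME
)
PRIMARY KEY (order_id)
DISTRIBUTED BY HASH (order_id);

-- Initial version of the row.
INSERT INTO orders VALUES (1001, 'CREATED', '2024-01-01 10:00:00');

-- Loading a row with an existing primary key is treated as an update
-- of that row rather than as a second row.
INSERT INTO orders VALUES (1001, 'SHIPPED', '2024-01-01 10:05:00');

-- Reads the current version of the row for this key.
SELECT order_id, status, updated_at FROM orders WHERE order_id = 1001;

-- Rows can also be deleted by key.
DELETE FROM orders WHERE order_id = 1001;
```

In a real pipeline, the changes would typically arrive from the upstream TP database through a loading or synchronization tool rather than through hand-written `INSERT` statements.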

### High-concurrency analytics

StarRocks leverages performant data distribution, flexible indexing, and intelligent materialized views to facilitate user-facing analytics at high concurrency:

- Advertiser report analysis

- Channel analysis for the retail industry

- User-facing analysis for SaaS

- Multi-tabbed dashboard analysis
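
As a rough sketch of how a materialized view can serve a user-facing report, the example below pre-aggregates a hypothetical `sales` table by channel and day. All table, column, and view names are assumptions made for illustration, and the refresh strategy shown is only one of several options.

```SQL
-- Hypothetical detail table; names and columns are illustrative only.
CREATE TABLE sales (
    sale_date DATE,
    channel   VARCHAR(32),
    amount    DECIMAL(18, 2)
)
DUPLICATE KEY (sale_date, channel)
DISTRIBUTED BY HASH (channel);

-- Asynchronous materialized view that stores the pre-aggregated result.
CREATE MATERIALIZED VIEW channel_daily_sales
DISTRIBUTED BY HASH (channel)
REFRESH ASYNC
AS
SELECT channel, sale_date, SUM(amount) AS total_amount
FROM sales
GROUP BY channel, sale_date;

-- A report query can read the pre-aggregated rows directly.
SELECT sale_date, total_amount
FROM channel_daily_sales
WHERE channel = 'online';
```

Depending on the StarRocks version and session settings, equivalent queries written against `sales` may also be rewritten by the optimizer to use the materialized view. Check the materialized view documentation for your version before relying on this.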

### Unified analytics

StarRocks provides a unified data analytics experience.

- One system can power various analytical scenarios, reducing system complexity and lowering Total Cost of Ownership (TCO).

- StarRocks unifies data lakes and data warehouses. Data in a lakehouse can all be managed in StarRocks. Latency-sensitive queries that require high concurrency can run on StarRocks. Data in data lakes can be accessed by using external catalogs or external tables provided by StarRocks.
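
For illustration, accessing data lake tables through an external catalog might look like the sketch below. The catalog name `hive_catalog`, the database and table names, and the metastore address are placeholders, and the example assumes a Hive metastore reachable from the cluster. Other catalog types take different properties.

```SQL
-- Hypothetical Hive catalog; replace the placeholder with your metastore address.
CREATE EXTERNAL CATALOG hive_catalog
PROPERTIES (
    "type" = "hive",
    "hive.metastore.uris" = "thrift://<metastore_host>:9083"
);

-- Query a data lake table in place by using catalog.database.table naming.
SELECT *
FROM hive_catalog.sales_db.orders
LIMIT 10;
```

With a catalog in place, the same SQL client can query both internal StarRocks tables and tables in the data lake, without loading the lake data first.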
@@ -4,6 +4,8 @@ displayed_sidebar: docs

# dict_mapping



Returns the value mapped to the specified key in a dictionary table.

This function is mainly used to simplify the application of a global dictionary table. During data loading into a target table, StarRocks automatically obtains the value mapped to the specified key from the dictionary table by using the input parameters in this function, and then loads the value into the target table.
@@ -26,7 +28,7 @@ key_column_expr ::= <column_name> | <expr>
- `[<db_name>.]<dict_table>`: The name of the dictionary table, which needs to be a Primary Key table. The supported data type is VARCHAR.
- `key_column_expr_list`: The expression list for key columns in the dictionary table, including one or multiple `key_column_exprs`. The `key_column_expr` can be the name of a key column in the dictionary table, or a specific key or key expression.

-This expression list needs to include all Primary Key columns of the dictionary table, which means the total number of expressions needs to match the total number of Primary Key columns in the dictionary table. So when the dictionary table uses composite primary key, the expressions in this list needs to correspond to the Primary Key columns defined in the table schema by sequence. Multiple expressions in this list are separated by commas (`,`). And if a `key_column_expr` is a specific key or key expression, its type must match the type of the corresponding Primary Key column in the dictionary table.
+This expression list needs to include all Primary Key columns of the dictionary table, which means the total number of expressions needs to match the total number of Primary Key columns in the dictionary table. So when the dictionary table uses a Composite Primary Key, the expressions in this list need to correspond, in order, to the Primary Key columns defined in the table schema. Multiple expressions in this list are separated by commas (`,`). If a `key_column_expr` is a specific key or key expression, its type must match the type of the corresponding Primary Key column in the dictionary table.

- Optional parameters:
- `<value_column>`: The name of the value column, which is also the mapping column. If the value column is not specified, the default value column is the AUTO_INCREMENT column of the dictionary table. The value column can also be defined as any column in the dictionary table excluding auto-incremented columns and primary keys. The column's data type has no restrictions.
Expand Down Expand Up @@ -172,15 +174,15 @@ ERROR 1064 (HY000): Query failed if record not exist in dict table.

**Example 5: If the dictionary table uses composite primary keys, all primary keys must be specified when querying.**

-1. Create a dictionary table with composite primary keys and load simulated data into it.
+1. Create a dictionary table with Composite Primary Keys and load simulated data into it.

```SQL
MySQL [test]> CREATE TABLE dict2 (
order_uuid STRING,
order_date DATE,
order_id_int BIGINT AUTO_INCREMENT
)
-PRIMARY KEY (order_uuid,order_date) -- composite primary Key
+PRIMARY KEY (order_uuid,order_date) -- Composite Primary Key
DISTRIBUTED BY HASH (order_uuid,order_date)
;
Query OK, 0 rows affected (0.02 sec)
@@ -201,7 +203,7 @@ ERROR 1064 (HY000): Query failed if record not exist in dict table.
3 rows in set (0.01 sec)
```

-2. Query the value mapped to the key in the dictionary table. Because the dictionary table has composite primary keys, all primary keys need to be specified in `dict_mapping`.
+2. Query the value mapped to the key in the dictionary table. Because the dictionary table has Composite Primary Keys, all primary keys need to be specified in `dict_mapping`.

```SQL
SELECT dict_mapping('dict2', 'a1', cast('2023-11-22' as DATE));