Commit

Update docs (#16)
* add docs for index paramters

* add operation config docs

* add flow control docs
jjh-kim committed Jul 26, 2024
1 parent 3f82078 commit 791f347
Showing 3 changed files with 55 additions and 0 deletions.
10 changes: 10 additions & 0 deletions doc/add-index.md
@@ -9,3 +9,13 @@ Before `ALTER TABLE ... ADD INDEX` command finishes, index is temporarily create
### Free Memory (Enhanced Monitoring)
Upon creating an index, the Free Memory as reported by Enhanced Monitoring will decrease. This decrease continues rapidly until it reaches a certain value. However, Aurora has the capability to immediately reclaim memory from FreeableMemory (as observed in CloudWatch), so this should not pose a significant issue. Nonetheless, it is important to monitor and ensure that neither Free Memory nor Freeable Memory reaches zero.

### InnoDB Parameters (MySQL 8.0.27 and above)
MySQL 8.0.27 introduced the `innodb_ddl_buffer_size` and `innodb_ddl_threads` variables, which, together with `innodb_parallel_read_threads`, can speed up secondary index creation.
SB-OSC supports setting these parameters in the migration configuration so that they take effect before indexes are created.
```yaml
innodb_ddl_buffer_size: 1048576
innodb_ddl_threads: 4
innodb_parallel_read_threads: 4
```
Please refer to the [MySQL documentation](https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html) for more information on these parameters.
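
As a rough illustration of how such settings might be applied, the sketch below turns the configuration values into `SET SESSION` statements that could be run on the connection before `ALTER TABLE ... ADD INDEX`. The helper `build_ddl_session_statements` is hypothetical and not part of SB-OSC; it assumes the three variables are set at session scope.

```python
# Hypothetical helper, not part of SB-OSC: converts the InnoDB DDL settings
# from a migration config into SET SESSION statements.
ALLOWED_DDL_PARAMS = (
    "innodb_ddl_buffer_size",
    "innodb_ddl_threads",
    "innodb_parallel_read_threads",
)


def build_ddl_session_statements(config: dict) -> list:
    """Return one SET SESSION statement per configured parameter."""
    statements = []
    for name in ALLOWED_DDL_PARAMS:
        value = config.get(name)
        if value is None:
            continue  # parameter not configured; keep the server default
        # int() guards against non-numeric values reaching the SQL string
        statements.append(f"SET SESSION {name} = {int(value)}")
    return statements


statements = build_ddl_session_statements({
    "innodb_ddl_buffer_size": 1048576,
    "innodb_ddl_threads": 4,
    "innodb_parallel_read_threads": 4,
})
```

Parameters that are absent from the config are skipped, so the server defaults stay in effect for them.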
10 changes: 10 additions & 0 deletions doc/config.md
@@ -1,5 +1,15 @@
# Config

## Flow Control
SB-OSC lets users control the flow of the migration process through several parameters. Set them to run only the stages that fit your environment and requirements.

### skip_bulk_import
If you set this parameter to `True`, SB-OSC skips the bulk import stage and starts from the apply DML events stage. This is useful when the data has already been copied to the destination table and only DML events need to be applied. For example, you can set it to `True` when you create a clone cluster to make the initial copy and then replicate changes with SB-OSC. When `skip_bulk_import` is `True`, `init_binlog_file` and `init_binlog_position` must also be set; otherwise SB-OSC raises an error.
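
The dependency between these parameters can be sketched as a small validation step. This is only an illustration of the rule described above, not SB-OSC's actual implementation; the function name `validate_skip_bulk_import`, the use of `ValueError`, and the sample binlog values are assumptions.

```python
from typing import Optional


def validate_skip_bulk_import(
    skip_bulk_import: bool,
    init_binlog_file: Optional[str] = None,
    init_binlog_position: Optional[int] = None,
) -> None:
    """Raise if bulk import is skipped without a binlog starting point."""
    if not skip_bulk_import:
        return  # bulk import runs normally; no starting point required
    required = {
        "init_binlog_file": init_binlog_file,
        "init_binlog_position": init_binlog_position,
    }
    missing = [name for name, value in required.items() if value is None]
    if missing:
        raise ValueError(
            "skip_bulk_import is True but these are not set: " + ", ".join(missing)
        )


# Illustrative values only; use the binlog coordinates of your initial copy.
validate_skip_bulk_import(True, "mysql-bin-changelog.000001", 4)
```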

### disable_apply_dml_events
If you set this parameter to `True`, SB-OSC pauses before the `apply_dml_events` stage. This is useful when you need to perform additional manual steps before DML events are applied.


## Chunk
### max_chunk_count & min_chunk_size
SB-OSC calculates the number of chunks to create based on the following formula:
35 changes: 35 additions & 0 deletions doc/operation-class.md
@@ -7,6 +7,13 @@ SB-OSC provides two default operation classes. `BaseOperation` is the default op

You can create your own operation class by inheriting from `BaseOperation` and overriding its methods. If you pass the operation class name to the `operation_class` parameter in the migration configuration, SB-OSC detects any operation class defined under the `src/sbosc/operations` directory and uses it for the migration process.

You can also add configs dedicated to the operation class. These configs are passed to the operation class as `operation_config`, wrapped in the dataclass you define.

```yaml
operation_class_config:
  retention_days: 30
```
## Example
### BaseOperation
@@ -35,6 +42,8 @@ class MessageRetentionOperation(BaseOperation):
### CrossClusterOperation
```python
from sbosc.operations.base import CrossClusterBaseOperation


class CrossClusterMessageRetentionOperation(CrossClusterBaseOperation):
    def _select_batch_query(self, start_pk, end_pk):
        return f'''
@@ -58,3 +67,29 @@ class CrossClusterMessageRetentionOperation(CrossClusterBaseOperation):
        dest_pks = [row[0] for row in dest_cursor.fetchall()]
        return list(set(source_pks) - set(dest_pks))
```
### Operation Config
```python
from dataclasses import dataclass

from sbosc.operations.base import BaseOperation
from sbosc.operations.operation import MigrationOperationConfig


@dataclass
class MessageRetentionConfig(MigrationOperationConfig):
    retention_days: int


class MessageRetentionOperation(BaseOperation):
    operation_config_class = MessageRetentionConfig
    operation_config: MessageRetentionConfig

    def _insert_batch_query(self, start_pk, end_pk):
        return f"""
            INSERT INTO {self.source_db}.{self.destination_table}({self.source_columns})
            SELECT {self.source_columns}
            FROM {self.source_db}.{self.source_table} AS source
            WHERE source.id BETWEEN {start_pk} AND {end_pk}
            AND source.ts > DATE_SUB(NOW(), INTERVAL {self.operation_config.retention_days} DAY)
        """
```
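
To illustrate what "wrapped in the dataclass" could look like, here is a minimal, self-contained sketch of turning the `operation_class_config` mapping into a config object. It does not show SB-OSC's actual loading code: `load_operation_config` is a hypothetical helper, and `MigrationOperationConfig` is redefined here as an empty stand-in so the snippet runs without SB-OSC installed.

```python
from dataclasses import dataclass, fields


@dataclass
class MigrationOperationConfig:
    """Empty stand-in for sbosc's base config class (assumption)."""


@dataclass
class MessageRetentionConfig(MigrationOperationConfig):
    retention_days: int


def load_operation_config(config_class, raw: dict):
    """Hypothetical helper: build the dataclass from the parsed YAML mapping."""
    known = {field.name for field in fields(config_class)}
    unknown = set(raw) - known
    if unknown:
        # Fail early on misspelled keys instead of silently ignoring them
        raise ValueError(f"Unknown operation config keys: {sorted(unknown)}")
    return config_class(**raw)


# Equivalent to the parsed `operation_class_config` section of the YAML
operation_config = load_operation_config(
    MessageRetentionConfig, {"retention_days": 30}
)
```

With a typed config object, the operation class can read `operation_config.retention_days` directly instead of indexing into a raw dictionary.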
