Database Migration
This page is about moving data between two different databases — for example, copying everything from a MySQL database into a fresh YugabyteDB one. That's a different problem from the Migrations guide, which is about keeping one database's schema in sync with your Python model as it changes over time. If you're looking to add a column to an existing table, see that page instead — this one is for "move this whole database somewhere else."
mydborm.migrate handles that move end-to-end: it reads the source
database's schema, converts each column to the equivalent type on the
target database, creates the matching tables (CREATE TABLE /
CREATE INDEX — the SQL statements that define a table's structure,
often shortened to "DDL"), copies every row over in manageable
batches, and double-checks the row counts match when it's done.
Supported source → target pairs:
- MySQL → YugabyteDB / PostgreSQL
- YugabyteDB / PostgreSQL → MySQL
- YugabyteDB ↔ PostgreSQL (same type system — no mapping needed)
Python API
from mydborm import MigrationEngine
engine = MigrationEngine(
source={"dialect": "mysql", "host": "127.0.0.1", "port": 3306,
"user": "root", "password": "<password>", "database": "shop"},
target={"dialect": "yugabyte", "host": "127.0.0.1", "port": 5433,
"user": "yugabyte", "password": "<password>", "database": "shop"},
)
result = engine.run(
tables = None, # None = every table, or ["users", "orders"]
chunk_size = 500,
overwrite = False, # skip tables that already have rows in the target
on_progress = lambda table, done, total: print(table, done, total),
verify = True, # compare row counts after the copy
)
print(result.summary())
assert result.is_success()
MigrationEngine opens the source and target connections independently —
both stay open for the duration of the migration and are always closed in a
finally block, even on error.
MigrationResult
result.tables_migrated # tables successfully copied
result.tables_failed # tables that raised an error
result.total_rows # rows counted in the source
result.rows_transferred # rows actually written to the target
result.duration # seconds
result.errors # ["orders: <exception text>", ...]
result.warnings # ["Skipped 'products' — target already has data", ...]
result.is_success() # True if tables_failed == 0 and no errors
result.summary() # human-readable report
Migrating by model class
MigrationEngine builds the target schema from whatever is live in the
source database. If you already have BaseModel subclasses and want the
model's own field definitions to be the source of truth for the target
table instead — typed columns, NOT NULL, defaults, and all — use
ObjectMigrator:
from mydborm import db, ObjectMigrator
from myapp.models import User, Order
source_db = db # or any configured ConnectionManager
target_db = db.__class__()
target_db.configure(dialect="yugabyte", host="127.0.0.1", port=5433,
user="yugabyte", password="<password>", database="shop")
migrator = ObjectMigrator(source_db, target_db, chunk_size=500)
result = migrator.migrate_model(User)
# {"table": "users", "rows_total": 12450, "rows_transferred": 12450, "skipped": False}
results = migrator.migrate_models([User, Order], overwrite=True)
# one model failing (e.g. a missing source table) doesn't stop the rest —
# failed entries come back as {"table": ..., "error": "..."}
ObjectMigrator doesn't own source_db/target_db — configure and close
them yourself, the same way you would for MigrationEngine. Row transfer
reuses the same chunked, retrying DataTransfer used internally by
MigrationEngine.
Dry run
dry_run() extracts the schema and generates DDL without opening a write
connection to the target — nothing is created or copied:
report = engine.dry_run(tables=["users", "orders"])
for t in report["tables"]:
print(t["table"], t["rows"], t["columns"])
print(t["create_table_sql"])
print(t["create_index_sql"])
print(report["warnings"]) # unmapped column types that fell back to TEXT
CLI usage
mydborm migrate-db \
--source-dialect mysql --source-host 127.0.0.1 --source-port 3306 \
--source-user root --source-password root --source-db shop \
--target-dialect yugabyte --target-host 127.0.0.1 --target-port 5433 \
--target-user yugabyte --target-password yugabyte --target-db shop
Preview without writing anything:
Migrate specific tables only, with a larger chunk size, replacing any existing rows in the target:
A run prints a status table as each table finishes, then a final summary:
Migrating shop (mysql) → shop (yugabyte)
✔ users — 12,450 row(s) transferred
✔ orders — 89,231 row(s) transferred
Table │ Rows │ Status
──────────────────────────────
users │ 12,450 │ ✔ Done
orders │ 89,231 │ ✔ Done
Total │ 101,681│ 2/2 done
Status : SUCCESS
Tables migrated : 2
Tables failed : 0
Total rows : 101681
Rows transferred : 101681
Duration : 4.2s
Handling large tables
Rows are read from the source in batches of chunk_size (500 by
default) and written to the target one batch at a time — the full
table is never loaded into memory at once, so this works the same way
whether a table has a hundred rows or a hundred million. If a batch
fails for a transient reason (a dropped connection, a brief lock), it
automatically retries up to 3 times before giving up on that table.
- Raise
chunk_size(e.g.2000–5000) for large tables with small rows to cut round-trips. - Lower it for tables with very large rows (big
TEXT/BLOB/JSONcolumns) to keep batches within driver/packet limits.
Existing data in the target
By default, a table that already has rows in the target is skipped — the table and its indexes are still created if missing, but no rows are copied, and a warning is recorded:
Pass overwrite=True (or --overwrite on the CLI) to delete the existing
rows in the target table before copying.
Verification
When verify=True (the default), every successfully migrated table is
checked after the copy: does it exist in the target, and does its row
count match the source? Mismatches are appended to result.warnings:
Table 'orders' is missing in the target database
Row count mismatch for 'products': source=1204 target=1198
Type mapping
Unmapped source types fall back to TEXT in the target, with a warning
recorded in result.warnings (or report["warnings"] for a dry run).
MySQL → YugabyteDB / PostgreSQL
| Source (MySQL) | Target |
|---|---|
| INT, INTEGER | INTEGER |
| TINYINT(1) | BOOLEAN |
| TINYINT | SMALLINT |
| SMALLINT | SMALLINT |
| BIGINT | BIGINT |
| BIGINT UNSIGNED | NUMERIC(20) |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE PRECISION |
| DECIMAL(p,s) | DECIMAL(p,s) |
| VARCHAR(n) | VARCHAR(n) |
| CHAR(n) | CHAR(n) |
| TEXT / TINYTEXT / MEDIUMTEXT / LONGTEXT | TEXT |
| BLOB / MEDIUMBLOB / LONGBLOB | BYTEA |
| BINARY(n), VARBINARY(n) | BYTEA |
| DATE | DATE |
| DATETIME | TIMESTAMP |
| TIMESTAMP | TIMESTAMPTZ |
| TIME | TIME |
| JSON | JSONB |
| ENUM(...) | VARCHAR(255) |
| SET(...) | TEXT |
YugabyteDB / PostgreSQL → MySQL
| Source | Target (MySQL) |
|---|---|
| INTEGER, SERIAL | INT |
| BOOLEAN | TINYINT(1) |
| SMALLINT | SMALLINT |
| BIGINT | BIGINT |
| NUMERIC(20) | BIGINT UNSIGNED |
| FLOAT | FLOAT |
| DOUBLE PRECISION | DOUBLE |
| DECIMAL(p,s) / NUMERIC(p,s) | DECIMAL(p,s) |
| VARCHAR(n) | VARCHAR(n) |
| CHAR(n) | CHAR(n) |
| TEXT | TEXT |
| BYTEA | BLOB |
| DATE | DATE |
| TIMESTAMP | DATETIME |
| TIMESTAMPTZ | DATETIME |
| TIME | TIME |
| JSONB / JSON | JSON |