Data Operations (CRUD)¶
Use mo_crud_kit for model-aware bulk create/read/update workflows with Polars frames.
This workflow is designed for data-heavy endpoints where serializer-by-row patterns become a bottleneck. It applies directly to high-volume ingestion and update APIs.
Prerequisites¶
- Models are migrated and use UUID primary/foreign keys.
Implementation¶
mo_crud_kit provides model-aware operations for bulk create, read, and update.
It uses model metadata plus Polars validation to keep writes fast and consistent.
1. Model Frames¶
mo_crud_kit works on a model-frame mapping: a dict that maps each Django model class to the pl.DataFrame or pl.LazyFrame holding its rows.
Each write operation returns:
- status: ok, partial_ok, or fail
- valid_model_frms: rows that passed validation
- invalid_model_frms: rows with error metadata
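A caller usually branches on the returned status before touching the frames. The following is a minimal sketch of that branching, not part of mo_crud_kit: `summarize_write_result` is a hypothetical helper, and plain lists stand in for the frames so the example runs without Django or Polars. An eager pl.DataFrame also supports `len()`, while a LazyFrame would presumably need to be collected first.

```python
from typing import Any, Mapping, Sized


def summarize_write_result(
    status: str,
    valid_model_frms: Mapping[Any, Sized],
    invalid_model_frms: Mapping[Any, Sized],
) -> str:
    """Turn a (status, valid, invalid) result into a one-line summary."""
    # Count rows across every model in each mapping.
    n_valid = sum(len(frm) for frm in valid_model_frms.values())
    n_invalid = sum(len(frm) for frm in invalid_model_frms.values())

    if status == "ok":
        return f"ok: wrote {n_valid} rows"
    if status == "partial_ok":
        return f"partial_ok: wrote {n_valid} rows, rejected {n_invalid}"
    if status == "fail":
        return f"fail: nothing written, {n_invalid} invalid rows"
    raise ValueError(f"unexpected status: {status!r}")


# Lists of row dicts stand in for frames; "OrderModel" stands in for the model class.
print(summarize_write_result("partial_ok", {"OrderModel": [{"id": 1}]}, {"OrderModel": [{"id": 2}]}))
```

Raising on an unknown status keeps a silent typo or a future status value from being treated as success.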
Sample Model Frame
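import polars as pl
from apps.orders.models import OrderModel
# One row per order; order_id holds the UUID primary key as a hex string.
order_df = pl.DataFrame(
    {
        "order_id": ["f2fa1a5b7abf4f37a3f7e14725c0b211"],
        "status": ["draft"],
        "total_amount": [120.50],
    }
)
# Map each model class to the frame that holds its rows.
model_frms = {
    OrderModel: order_df,
}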
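Because the models are expected to use UUID keys, it can help to confirm that key values parse as UUIDs before building a frame. This is a small standard-library sketch; `non_uuid_values` is a hypothetical helper, and the validators in mo_crud_kit may apply different or stricter rules.

```python
import uuid


def non_uuid_values(values):
    """Return the values that do not parse as UUIDs."""
    bad = []
    for value in values:
        try:
            # Accepts both dashed and 32-character hex forms.
            uuid.UUID(str(value))
        except ValueError:
            bad.append(value)
    return bad


order_ids = ["f2fa1a5b7abf4f37a3f7e14725c0b211", "not-a-uuid"]
print(non_uuid_values(order_ids))  # ['not-a-uuid']
```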
2. Create¶
Create rows from model-to-frame mappings with optional validation pipeline.
Usage:
from django_mindoff.components.crud_kit import mo_crud_kit
status, valid_model_frms, invalid_model_frms = mo_crud_kit.create(
    {
        OrderModel: order_df,
        OrderItemModel: order_item_df,
    },
    is_partial=False,
    is_validate=True,
    batch_size=1000,
)
Parameters:
- model_frms (dict[type[models.Model], pl.DataFrame | pl.LazyFrame]): Input model-frame mapping for bulk insert.
- is_partial (bool, default=False): If True, allows partial save when only a subset of rows is valid.
- is_validate (bool, default=True): If True, runs column, row, and foreign-key validation before insert.
- batch_size (int, default=1000): Batch size used by the underlying write process.
Varieties:
- Validation mode: is_validate=True runs ColumnValidator -> RowValidator -> ForeignKeyValidator.
- Partial-save mode: is_partial=False fails if any invalid rows exist; is_partial=True saves valid rows and returns invalid rows separately.
Possible responses:
- Returns ("ok", valid_model_frms, {}) when all rows are valid and inserted.
- Returns ("partial_ok", valid_model_frms, invalid_model_frms) when partial mode is enabled and some rows are invalid.
- Returns ("fail", valid_model_frms, invalid_model_frms) when validation fails and no write should proceed.
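The three outcomes reduce to a small decision rule. This is a minimal sketch, assuming the status depends only on whether invalid rows exist and on is_partial. `resolve_status` is a hypothetical helper, not part of mo_crud_kit. The text does not say what happens in partial mode when no rows are valid, so that case is not special-cased here.

```python
def resolve_status(n_invalid: int, is_partial: bool) -> str:
    """Map the invalid-row count and the is_partial flag to a status string."""
    if n_invalid == 0:
        return "ok"  # everything passed validation
    if is_partial:
        return "partial_ok"  # valid rows are saved, invalid rows are returned
    return "fail"  # any invalid row blocks the whole write


# Print the rule as a small truth table.
for n_invalid in (0, 3):
    for is_partial in (False, True):
        print(n_invalid, is_partial, resolve_status(n_invalid, is_partial))
```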
Notes:
- Invalid rows contain an error column (named by POLARS_VALIDATOR_ERROR_COL, default __error__info).
- is_validate=False skips safety checks and may persist unsafe data.
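To see why rows were rejected, one option is to tally the values in the error column. The sketch below works on plain row dicts, which Polars can produce with `DataFrame.to_dicts()`. `count_errors` is a hypothetical helper, and the shape of the error value (a plain string here) is an assumption, since the text does not specify it.

```python
from collections import Counter

ERROR_COL = "__error__info"  # default error column name; may be configured differently


def count_errors(invalid_rows, error_col=ERROR_COL):
    """Count how many invalid rows carry each error value."""
    return Counter(str(row.get(error_col)) for row in invalid_rows)


# Stand-in for invalid_model_frms[OrderModel].to_dicts(); the messages are made up.
invalid_rows = [
    {"order_id": "a1", "status": "bogus", ERROR_COL: "status: invalid choice"},
    {"order_id": "b2", "status": "??", ERROR_COL: "status: invalid choice"},
    {"order_id": "c3", "status": None, ERROR_COL: "status: required"},
]
for message, count in count_errors(invalid_rows).most_common():
    print(count, message)
```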
3. Read¶
Read queryset data into Polars with streaming/pagination variants.
Usage:
from django_mindoff.components.crud_kit import mo_crud_kit
frm, stats = mo_crud_kit.read(
    OrderModel.objects.filter(is_active=True).values(),
    page_number=1,
    is_lazy=False,
    batch_size=100,
)
Parameters:
- qs (models.QuerySet): Queryset that must use .values() output.
- page_number (int | None, default=None): When provided, enables pagination mode. When None, uses streaming mode.
- is_lazy (bool, default=False): If True, returns pl.LazyFrame; otherwise returns pl.DataFrame.
- batch_size (int, default=0): Chunk/page size. Auto-resolved when 0.
Varieties:
- Streaming mode (page_number=None): reads full dataset in chunks.
- Pagination mode (page_number=<n>): reads one page and returns paging metadata.
- Materialization mode: eager (DataFrame) or lazy (LazyFrame).
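Reading in chunks means consuming the row source in fixed-size batches rather than loading everything at once. The sketch below shows that general pattern with the standard library only. `iter_chunks` is a hypothetical helper and a generator of dicts stands in for a `.values()` queryset; it is not how mo_crud_kit reads data internally.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_chunks(rows: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most batch_size rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return  # source exhausted
        yield chunk


# Five rows read in chunks of two.
rows = ({"order_no": i} for i in range(5))
for chunk in iter_chunks(rows, batch_size=2):
    print(len(chunk), chunk)
```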
Possible responses:
- Returns (frm, stats) where frm is a DataFrame/LazyFrame and stats includes: mode, batch_size, total_count, total_pages, current_page, has_next, has_previous.
- Returns empty frame with zeroed stats for empty querysets.
- Raises validation error if queryset is not .values()-based.
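The paging fields in stats relate to each other in the usual way. This is a rough sketch of that arithmetic, assuming 1-based page numbers and ceiling division for total_pages. `paging_stats` is a hypothetical helper, the "pagination" mode label is an assumed value, and the exact stats that read() returns for empty querysets may differ.

```python
import math


def paging_stats(total_count: int, page_number: int, batch_size: int) -> dict:
    """Build paging metadata for a 1-based page number."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    total_pages = math.ceil(total_count / batch_size)
    return {
        "mode": "pagination",  # assumed label, not confirmed by the docs
        "batch_size": batch_size,
        "total_count": total_count,
        "total_pages": total_pages,
        "current_page": page_number,
        "has_next": page_number < total_pages,
        "has_previous": page_number > 1,
    }


# 250 rows at 100 per page give 3 pages.
print(paging_stats(total_count=250, page_number=1, batch_size=100))
```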
4. Update¶
Upsert rows from model-to-frame mappings with optional staged merge strategy.
Usage:
from django_mindoff.components.crud_kit import mo_crud_kit
status, valid_model_frms, invalid_model_frms = mo_crud_kit.update(
    {
        OrderModel: order_updates_df,
    },
    is_partial=True,
    is_validate=True,
    batch_size=1000,
    is_temp_table=True,
)
Parameters:
- model_frms (dict[type[models.Model], pl.DataFrame | pl.LazyFrame]): Input model-frame mapping for bulk update/upsert.
- is_partial (bool, default=False): If True, allows valid rows to proceed even when invalid rows exist.
- is_validate (bool, default=True): If True, applies model-based column/row/FK validation before update.
- batch_size (int, default=1000): Batch size used for missing-column fetch and update processing.
- is_temp_table (bool, default=True): If True, writes to staging table then merges; if False, performs direct dialect-specific upsert.
Varieties:
- Validation mode: enabled (is_validate=True) or skipped (is_validate=False).
- Partial mode: fail-fast (is_partial=False) or partial success (is_partial=True).
- Upsert mode: staging merge (is_temp_table=True) or direct upsert (is_temp_table=False).
Possible responses:
- Returns ("ok", valid_model_frms, {}) when all rows are valid and updated.
- Returns ("partial_ok", valid_model_frms, invalid_model_frms) when partial mode is enabled and some rows are invalid.
- Returns ("fail", valid_model_frms, invalid_model_frms) when validation blocks update.
Notes:
- Missing DB columns are auto-fetched using primary key before validation.
- Invalid rows include model-aware error details in the error column.
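The staging-merge idea behind is_temp_table=True can be illustrated with the standard-library sqlite3 module: load incoming rows into a temporary table, then update matching rows and insert the rest. This shows the general technique only. The table names, the SQL, and the `staged_upsert` helper are assumptions for the example, not the statements mo_crud_kit issues.

```python
import sqlite3


def staged_upsert(conn: sqlite3.Connection, rows) -> None:
    """Upsert (order_id, status, total_amount) rows through a temp staging table."""
    conn.execute(
        "CREATE TEMP TABLE staging_orders "
        "(order_id TEXT PRIMARY KEY, status TEXT, total_amount REAL)"
    )
    try:
        with conn:  # commits on success, rolls back on error
            conn.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", rows)
            # Step 1: overwrite rows that already exist in the target table.
            conn.execute(
                """
                UPDATE orders
                SET status = (SELECT s.status FROM staging_orders s
                              WHERE s.order_id = orders.order_id),
                    total_amount = (SELECT s.total_amount FROM staging_orders s
                                    WHERE s.order_id = orders.order_id)
                WHERE order_id IN (SELECT order_id FROM staging_orders)
                """
            )
            # Step 2: insert staged rows that the target table does not have yet.
            conn.execute(
                """
                INSERT INTO orders (order_id, status, total_amount)
                SELECT order_id, status, total_amount FROM staging_orders
                WHERE order_id NOT IN (SELECT order_id FROM orders)
                """
            )
    finally:
        conn.execute("DROP TABLE IF EXISTS staging_orders")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT PRIMARY KEY, status TEXT, total_amount REAL)")
conn.execute("INSERT INTO orders VALUES ('a1', 'draft', 10.0)")
staged_upsert(conn, [("a1", "paid", 12.5), ("b2", "draft", 7.0)])
print(conn.execute("SELECT * FROM orders ORDER BY order_id").fetchall())
```

Staging keeps the bulk load separate from the merge, so the target table is touched by two set-based statements instead of one statement per row.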
Example Usage¶
from django_mindoff.components.crud_kit import mo_crud_kit
from apps.orders.models import OrderModel
# CREATE
create_status, create_valid, create_invalid = mo_crud_kit.create(
    {OrderModel: order_df},
    is_validate=True,
    is_partial=False,
)
# READ (queryset must use values())
orders_frm, stats = mo_crud_kit.read(
    OrderModel.objects.filter(is_active=True).values(),
    page_number=1,
    batch_size=100,
)
# UPDATE
update_status, update_valid, update_invalid = mo_crud_kit.update(
    {OrderModel: order_df},
    is_validate=True,
    is_partial=True,
    is_temp_table=True,
)
Core Concepts¶
1. Polars Serialization¶
mo_crud_kit uses model metadata plus Polars validators (ColumnValidator, RowValidator, ForeignKeyValidator) to sanitize and validate rows before DB writes.
This is the intended replacement for serializer-driven bulk validation in data-heavy pipelines.
What this gives you:
- Type normalization aligned with Django field definitions.
- Constraint checks (required/nullability, choices, min/max, length, FK consistency).
- Structured invalid-row capture in the configured error column (POLARS_VALIDATOR_ERROR_COL, default __error__info).
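The column -> row -> foreign-key ordering and the error-column capture can be pictured with a small pure-Python pipeline. This is only a conceptual sketch: the real validators operate on Polars frames, and every name here (`check_columns`, `check_row`, `make_fk_check`, `run_pipeline`, the `customer_id` field, the status choices) is hypothetical.

```python
ERROR_COL = "__error__info"  # default error column name


def check_columns(row):
    """Column stage: required columns must be present."""
    missing = {"order_id", "status", "customer_id"} - row.keys()
    return f"missing columns: {sorted(missing)}" if missing else None


def check_row(row):
    """Row stage: value-level constraints such as choices."""
    if row["status"] not in {"draft", "paid"}:
        return f"invalid status: {row['status']!r}"
    return None


def make_fk_check(known_customer_ids):
    """Foreign-key stage: the referenced id must exist."""
    def check_fk(row):
        if row["customer_id"] not in known_customer_ids:
            return f"unknown customer_id: {row['customer_id']!r}"
        return None
    return check_fk


def run_pipeline(rows, validators):
    """Run validators in order; the first error sends the row to the invalid list."""
    valid, invalid = [], []
    for row in rows:
        error = None
        for check in validators:
            error = check(row)
            if error:
                break  # later stages assume earlier ones passed
        if error:
            invalid.append({**row, ERROR_COL: error})
        else:
            valid.append(row)
    return valid, invalid


rows = [
    {"order_id": "a1", "status": "draft", "customer_id": "c1"},
    {"order_id": "b2", "status": "bogus", "customer_id": "c1"},
    {"order_id": "c3", "status": "paid", "customer_id": "c9"},
    {"order_id": "d4", "status": "paid"},
]
valid, invalid = run_pipeline(rows, [check_columns, check_row, make_fk_check({"c1"})])
print(len(valid), [r[ERROR_COL] for r in invalid])
```

Running the column stage first means later stages can index fields directly without guarding against missing keys.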
Validate before write
mo_crud_kit is built for validated tabular data. Keep request-level validation in API code before CRUD execution.
2. Limitations¶
- ManyToManyField is not supported in row validation.
- BinaryField is not supported in row validation.
- Any Django field not mapped in row validator dtype map is unsupported.
- Models must use UUID primary keys for CRUD validation flow.
- Primary key and foreign key fields are expected to define explicit db_column.
- mo_crud_kit.read() requires queryset .values() input.
- mo_crud_kit.delete() is not currently exposed.
Troubleshooting¶
- read() fails with shape/type errors
  Confirm queryset input uses .values() and field names align with frame columns.
- Rows are silently excluded from writes
  Inspect invalid_model_frms and the configured error column to trace validation failures.
- FK validation fails unexpectedly
  Check UUID types and explicit db_column configuration on related fields.