Quick Start
Get up and running with TracePipe in 5 minutes.
1. Enable Tracking
import tracepipe as tp
import pandas as pd
# Start tracking with debug mode for full lineage
tp.enable(mode="debug", watch=["price", "quantity"])
The watch parameter specifies which columns to track for cell-level changes.
2. Run Your Pipeline
Write your pandas code as usual; TracePipe instruments it automatically:
df = pd.DataFrame({
    "product": ["A", "B", "C", "D"],
    "price": [10.0, None, 30.0, 40.0],
    "quantity": [5, 10, 0, 8]
})
df = df.dropna()  # Drops row B (null price)
df = df[df["quantity"] > 0]  # Drops row C (zero quantity)
df["total"] = df["price"] * df["quantity"]
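To make the expected numbers concrete, the sketch below replays the same three steps on plain Python dictionaries and counts how many rows each step removes. It uses neither TracePipe nor pandas; the helper `apply_filter` and the step labels are illustrative names chosen for this example, not part of any library.

```python
# Hand-rolled replay of the pipeline above, standard library only.
# apply_filter and the step labels are illustrative, not TracePipe APIs.

rows = [
    {"product": "A", "price": 10.0, "quantity": 5},
    {"product": "B", "price": None, "quantity": 10},
    {"product": "C", "price": 30.0, "quantity": 0},
    {"product": "D", "price": 40.0, "quantity": 8},
]

drops_by_step = {}


def apply_filter(data, label, keep):
    """Keep rows where keep(row) is true and record how many were removed."""
    kept = [row for row in data if keep(row)]
    drops_by_step[label] = len(data) - len(kept)
    return kept


n_start = len(rows)

# Step 1: drop rows containing a missing value (row B).
rows = apply_filter(rows, "dropna", lambda r: all(v is not None for v in r.values()))

# Step 2: keep only rows with a positive quantity (drops row C).
rows = apply_filter(rows, "quantity > 0", lambda r: r["quantity"] > 0)

# Step 3: add the derived column to the surviving rows.
for row in rows:
    row["total"] = row["price"] * row["quantity"]

retention = len(rows) / n_start

print(drops_by_step)  # one row removed by each filter
print(retention)      # 2 of 4 rows survive
print([(r["product"], r["total"]) for r in rows])
```

Two of the four rows survive, one row is lost at each filtering step, and `total` is filled in for the two remaining rows. Those are the figures the health check in the next step reports.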
3. Check Pipeline Health
Output:
TracePipe Check: [OK] Pipeline healthy
Mode: debug
Retention: 50%
Dropped: 2 rows
• DataFrame.dropna: 1
• DataFrame.__getitem__[mask]: 1
Value changes: 2 cells
• DataFrame.__setitem__[total]: 2
The CheckResult object provides convenient properties:
result.passed          # True/False
result.retention       # 0.5 (row retention rate)
result.n_dropped       # 2 (total dropped rows)
result.drops_by_op     # {"DataFrame.dropna": 1, ...}
result.n_changes       # 2 (cell changes, debug mode only)
result.changes_by_op   # {"DataFrame.__setitem__[total]": 2}
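A common use of these properties is to fail a job when too many rows are lost. The sketch below is a minimal illustration under stated assumptions: `StandInResult` is a hypothetical stand-in that only mirrors some of the property names listed above (it is not TracePipe's CheckResult class), and `enforce_retention` is a helper invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StandInResult:
    """Hypothetical stand-in that mirrors the documented property names."""
    passed: bool
    retention: float
    n_dropped: int
    drops_by_op: dict = field(default_factory=dict)


def enforce_retention(result, min_retention):
    """Raise ValueError when retention falls below min_retention."""
    if result.retention >= min_retention:
        return
    # Name the operation that removed the most rows to speed up debugging.
    worst_op = max(result.drops_by_op, key=result.drops_by_op.get, default="unknown")
    worst_count = result.drops_by_op.get(worst_op, 0)
    raise ValueError(
        f"retention {result.retention:.0%} is below the required "
        f"{min_retention:.0%}; largest drop: {worst_op} ({worst_count} rows)"
    )


# Values copied from the example output above.
result = StandInResult(
    passed=True,
    retention=0.5,
    n_dropped=2,
    drops_by_op={"DataFrame.dropna": 1, "DataFrame.__getitem__[mask]": 1},
)

enforce_retention(result, min_retention=0.4)  # threshold met, returns quietly

message = None
try:
    enforce_retention(result, min_retention=0.9)  # threshold not met
except ValueError as exc:
    message = str(exc)
    print(message)
```

Because the helper reads only attribute names, the same function should work with any object that exposes `retention` and `drops_by_op`, which is the assumption made about the real result object here.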
4. Trace a Row's Journey
Output:
5. Explain a Cell's Value
Output:
Cell History: row 0, column 'total'
Current value: 50.0
History (1 change):
None -> 50.0
by: DataFrame.__setitem__[total]
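One way to picture the history above is as an append-only log of (row, column, old value, new value, operation) entries that is filtered when a cell is queried. The sketch below illustrates that idea in plain Python. `CellChange`, `record_change`, and `history_for` are hypothetical names for this example and say nothing about how TracePipe stores lineage internally.

```python
from typing import Any, NamedTuple


class CellChange(NamedTuple):
    """One recorded change to a single cell (illustrative structure)."""
    row: int
    column: str
    old: Any
    new: Any
    operation: str


log = []  # append-only list of CellChange entries


def record_change(row, column, old, new, operation):
    """Append an entry only when the value actually changed."""
    if old != new:
        log.append(CellChange(row, column, old, new, operation))


def history_for(row, column):
    """Return all recorded changes for one cell, oldest first."""
    return [c for c in log if c.row == row and c.column == column]


# Creating the 'total' column fills a previously missing value in each surviving row.
record_change(0, "total", None, 50.0, "DataFrame.__setitem__[total]")
record_change(3, "total", None, 320.0, "DataFrame.__setitem__[total]")
# An assignment that leaves the value unchanged is not logged.
record_change(0, "price", 10.0, 10.0, "no-op assignment")

history = history_for(0, "total")
for change in history:
    print(f"{change.old} -> {change.new}  by: {change.operation}")
```

Filtering the log by row and column yields the single `None -> 50.0` entry shown in the output above.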
6. Generate a Report
This creates an interactive HTML report with:
- Pipeline flow diagram
- Retention funnel visualization
- Dropped rows by operation
- Cell change history
7. Clean Up
Next Steps
- Learn about CI vs Debug modes
- Explore health checks in depth
- Set up data contracts
- See real-world examples