mirror of
https://github.com/zebrajr/ArchiveBox.git
synced 2026-01-15 12:15:10 +00:00
Add CLAUDE.md with development and testing guide
161
CLAUDE.md
Normal file
@@ -0,0 +1,161 @@
# Claude Code Development Guide for ArchiveBox

## Quick Start

```bash
# Set up dev environment
uv sync --dev

# Run tests as non-root user (required - ArchiveBox refuses to run as root)
sudo -u testuser bash -c 'source .venv/bin/activate && python -m pytest archivebox/tests/ -v'
```

## Development Environment Setup

### Prerequisites
- Python 3.11+ (3.13 recommended)
- uv package manager
- A non-root user for running tests (e.g., `testuser`)

### Install Dependencies
```bash
uv sync --dev
```

### Activate Virtual Environment
```bash
source .venv/bin/activate
```

## Running Tests

### CRITICAL: Never Run as Root
ArchiveBox checks the current user at startup and refuses to run as root. Always run tests as a non-root user:

```bash
# Run all migration tests
sudo -u testuser bash -c 'source .venv/bin/activate && python -m pytest archivebox/tests/test_migrations_*.py -v'

# Run specific test file
sudo -u testuser bash -c 'source .venv/bin/activate && python -m pytest archivebox/tests/test_migrations_08_to_09.py -v'

# Run single test
sudo -u testuser bash -c 'source .venv/bin/activate && python -m pytest archivebox/tests/test_migrations_fresh.py::TestFreshInstall::test_init_creates_database -xvs'
```

### Test File Structure
```
archivebox/tests/
├── test_migrations_helpers.py       # Schemas, seeding functions, verification helpers
├── test_migrations_fresh.py         # Fresh install tests
├── test_migrations_04_to_09.py      # 0.4.x → 0.9.x migration tests
├── test_migrations_07_to_09.py      # 0.7.x → 0.9.x migration tests
└── test_migrations_08_to_09.py      # 0.8.x → 0.9.x migration tests
```

## Test Writing Standards

### NO MOCKS - Real Tests Only
Tests must exercise real code paths:
- Create real SQLite databases with version-specific schemas
- Seed them with realistic test data
- Run actual `python -m archivebox` commands via subprocess
- Query SQLite directly to verify results

### NO SKIPS
Never use `@skip`, `skipTest`, or `pytest.mark.skip`. Every test must run.

### Strict Assertions
- The `init` command must return exit code 0 (do not accept any of `[0, 1]`)
- Verify that ALL data is preserved, not just "at least one" row
- Use exact counts (`==`), not loose bounds (`>=`)

### Example Test Pattern
```python
def test_migration_preserves_snapshots(self):
    """Migration should preserve all snapshots."""
    # `init` triggers the migrations; require a clean exit code
    result = run_archivebox(self.work_dir, ['init'], timeout=45)
    self.assertEqual(result.returncode, 0, f"Init failed: {result.stderr}")

    # expected_count: number of snapshots seeded before the migration (defined elsewhere in the test)
    ok, msg = verify_snapshot_count(self.db_path, expected_count)
    self.assertTrue(ok, msg)
```
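
The pattern above relies on a verification helper that returns an `(ok, msg)` pair. As a rough illustration of the "exact counts" rule, here is a minimal sketch of how such a helper might look. It is a hypothetical stand-in, not the real `verify_snapshot_count` from `test_migrations_helpers.py`, and the one-column `core_snapshot` table in the demo is an assumption made only to keep the example self-contained.

```python
import sqlite3
import tempfile
from pathlib import Path


def verify_snapshot_count(db_path, expected_count):
    """Return (ok, msg); ok is True only when the count matches exactly."""
    conn = sqlite3.connect(db_path)
    try:
        (actual,) = conn.execute("SELECT COUNT(*) FROM core_snapshot").fetchone()
    finally:
        conn.close()
    if actual == expected_count:
        return True, f"Found {actual} snapshots"
    return False, f"Expected {expected_count} snapshots, found {actual}"


# Demo against a throwaway database (illustrative single-column schema).
with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "index.sqlite3"
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE core_snapshot (url TEXT)")
    conn.executemany(
        "INSERT INTO core_snapshot (url) VALUES (?)",
        [("https://example.com",), ("https://example.org",)],
    )
    conn.commit()
    conn.close()

    exact = verify_snapshot_count(db_path, 2)       # matches the seeded rows
    off_by_one = verify_snapshot_count(db_path, 3)  # a loose ">= 1" check would hide this
```

Returning a message alongside the boolean lets `assertTrue(ok, msg)` report the actual and expected counts when a migration drops rows.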

## Migration Testing

### Schema Versions
- **0.4.x**: First Django version. Tags as comma-separated string, no ArchiveResult model
- **0.7.x**: Tag model with M2M, ArchiveResult model, AutoField PKs
- **0.8.x**: Crawl/Seed models, UUID PKs, status fields, depth/retry_at
- **0.9.x**: Seed model removed, seed_id FK removed from Crawl

### Testing a Migration Path
1. Create a SQLite DB with the source version's schema (from `test_migrations_helpers.py`)
2. Seed it with realistic test data using `seed_0_X_data()`
3. Run `archivebox init` to trigger migrations
4. Verify data preservation with the `verify_*` functions
5. Check that CLI commands still work after the migration (`status`, `list`, `add`, etc.)
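
Steps 1 and 2 can be sketched with nothing but the standard library. The schema below is a deliberately simplified, hypothetical one (the real version-specific schemas live in `test_migrations_helpers.py`), and `create_source_db` and `seed_toy_data` are illustrative names, not functions from the repository.

```python
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical toy schema; real tests use the version-specific schemas
# defined in test_migrations_helpers.py.
TOY_SCHEMA = """
CREATE TABLE IF NOT EXISTS core_snapshot (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL UNIQUE,
    title TEXT
);
"""


def create_source_db(db_path):
    """Step 1: create a real SQLite file with the source-version schema."""
    conn = sqlite3.connect(db_path)
    conn.executescript(TOY_SCHEMA)
    conn.commit()
    conn.close()


def seed_toy_data(db_path, urls):
    """Step 2: insert one row per URL and return the rows written."""
    rows = [(url, f"Title for {url}") for url in urls]
    conn = sqlite3.connect(db_path)
    conn.executemany("INSERT INTO core_snapshot (url, title) VALUES (?, ?)", rows)
    conn.commit()
    conn.close()
    return rows


with tempfile.TemporaryDirectory() as work_dir:
    db_path = Path(work_dir) / "index.sqlite3"
    create_source_db(db_path)
    seeded = seed_toy_data(db_path, ["https://example.com/a", "https://example.com/b"])

    # Step 3 (running `archivebox init` in work_dir) would happen here in a real test.

    # Step 4: read the rows back directly from SQLite and compare exactly.
    conn = sqlite3.connect(db_path)
    stored = conn.execute("SELECT url, title FROM core_snapshot ORDER BY id").fetchall()
    conn.close()
```

Keeping the seeded rows around makes it possible to compare every value after the migration instead of only checking that some data survived.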

### Squashed Migrations
When testing 0.8.x (dev branch), you must record ALL replaced migrations:
```python
# The squashed migration replaces these - all must be recorded
('core', '0023_alter_archiveresult_options_archiveresult_abid_and_more'),
('core', '0024_auto_20240513_1143'),
# ... all 52 migrations from 0023-0074 ...
('core', '0023_new_schema'),  # Also record the squashed migration itself
```
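
"Recording" a migration means adding a row to Django's `django_migrations` table, which has `app`, `name`, and `applied` columns. The sketch below shows one way a test fixture might do that. `record_migrations` is a hypothetical helper, the timestamp format is illustrative, and the list contains only the names quoted above rather than the full set a real test would need.

```python
import sqlite3
from datetime import datetime, timezone

# Only the names quoted in this guide; a real fixture would list every
# replaced migration plus the squashed one.
MIGRATIONS = [
    ("core", "0023_alter_archiveresult_options_archiveresult_abid_and_more"),
    ("core", "0024_auto_20240513_1143"),
    ("core", "0023_new_schema"),
]


def record_migrations(conn, migrations):
    """Mark each (app, name) pair as applied in django_migrations."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS django_migrations ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "app VARCHAR(255) NOT NULL, "
        "name VARCHAR(255) NOT NULL, "
        "applied DATETIME NOT NULL)"
    )
    now = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO django_migrations (app, name, applied) VALUES (?, ?, ?)",
        [(app, name, now) for app, name in migrations],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
record_migrations(conn, MIGRATIONS)
recorded = conn.execute(
    "SELECT app, name FROM django_migrations ORDER BY id"
).fetchall()
conn.close()
```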

## Common Gotchas

### 1. File Permissions
Files created by root must be made readable by `testuser`:
```bash
chmod 644 archivebox/tests/test_*.py
```

### 2. DATA_DIR Environment Variable
Tests use temp directories. The `run_archivebox()` helper sets `DATA_DIR` automatically.

### 3. Extractors Disabled for Speed
Tests disable all extractors via environment variables for faster execution:
```python
env['SAVE_TITLE'] = 'False'
env['SAVE_FAVICON'] = 'False'
# ... etc
```
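
To make gotchas 2 and 3 concrete, here is a minimal sketch of how a helper in the spirit of `run_archivebox()` could build its environment and launch the CLI. Both function names are hypothetical, only the two extractor variables named above are set, and running the subprocess from inside the data directory is an assumption; the real helper may differ.

```python
import os
import subprocess
import sys


def build_test_env(data_dir, base_env=None):
    """Copy the environment, point DATA_DIR at the temp dir, disable extractors."""
    env = dict(os.environ if base_env is None else base_env)
    env["DATA_DIR"] = str(data_dir)
    env["SAVE_TITLE"] = "False"
    env["SAVE_FAVICON"] = "False"
    # ... the remaining extractor toggles would be set the same way
    return env


def run_archivebox_sketch(data_dir, args, timeout=60):
    """Run `python -m archivebox <args>` as a real subprocess (no mocks)."""
    return subprocess.run(
        [sys.executable, "-m", "archivebox", *args],
        cwd=str(data_dir),          # assumption: commands run inside the data dir
        env=build_test_env(data_dir),
        capture_output=True,
        text=True,
        timeout=timeout,            # raises subprocess.TimeoutExpired when exceeded
    )


# Building the environment does not require ArchiveBox to be installed.
env = build_test_env("/tmp/archivebox-test", base_env={"PATH": "/usr/bin"})
```

Splitting out the environment construction keeps the settings easy to inspect without invoking the CLI.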

### 4. Timeout Settings
Use appropriate timeouts for migration tests: 45s for `init`, and the 60s default otherwise.

### 5. Circular FK References in Schemas
SQLite does not check that a referenced table exists when a table is created, so tables with circular foreign keys can be created in any order. Using `IF NOT EXISTS` keeps schema creation idempotent. Order matters less than in other DBs.
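
The ordering point can be checked with a small experiment. This sketch uses two hypothetical tables (`table_a` and `table_b`, not ArchiveBox tables) that reference each other, creates them in one script, and runs the script twice to show that `IF NOT EXISTS` makes it safe to repeat.

```python
import sqlite3

# Two hypothetical tables that reference each other.
SCHEMA = """
CREATE TABLE IF NOT EXISTS table_a (
    id INTEGER PRIMARY KEY,
    b_id INTEGER REFERENCES table_b(id)
);
CREATE TABLE IF NOT EXISTS table_b (
    id INTEGER PRIMARY KEY,
    a_id INTEGER REFERENCES table_a(id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# table_a is created before the table it references exists.
conn.executescript(SCHEMA)
# Running the same script again is a no-op thanks to IF NOT EXISTS.
conn.executescript(SCHEMA)

# Break the cycle at insert time with a NULL, then close the loop.
conn.execute("INSERT INTO table_a (id, b_id) VALUES (1, NULL)")
conn.execute("INSERT INTO table_b (id, a_id) VALUES (1, 1)")
conn.execute("UPDATE table_a SET b_id = 1 WHERE id = 1")
conn.commit()

tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]
link = conn.execute("SELECT b_id FROM table_a WHERE id = 1").fetchone()[0]
conn.close()
```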

## Architecture Notes

### Crawl Model (0.9.x)
- Crawl groups multiple Snapshots from a single `add` command
- Each `add` creates one Crawl with one or more Snapshots
- Seed model was removed - crawls now store URLs directly

### Migration Strategy
- Squashed migrations for clean installs
- Individual migrations recorded for upgrades from dev branch
- `replaces` attribute in squashed migrations lists what they replace

## Debugging Tips

### Check Migration State
```bash
sqlite3 /path/to/index.sqlite3 "SELECT app, name FROM django_migrations WHERE app='core' ORDER BY id;"
```

### Check Table Schema
```bash
sqlite3 /path/to/index.sqlite3 "PRAGMA table_info(core_snapshot);"
```
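
The same schema check can be done from inside a test with Python's `sqlite3` module. The sketch below is one possible approach: `table_columns` is a hypothetical helper, and the two-column `core_snapshot` table is a toy stand-in for the real schema.

```python
import sqlite3


def table_columns(conn, table):
    """Map column name -> declared type using PRAGMA table_info."""
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return {row[1]: row[2] for row in rows}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE core_snapshot (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
columns = table_columns(conn, "core_snapshot")
missing = table_columns(conn, "no_such_table")  # unknown tables yield no rows
conn.close()
```

An empty result for an unknown table makes it easy to assert that a table was dropped or renamed by a migration.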

### Verbose Test Output
```bash
sudo -u testuser bash -c 'source .venv/bin/activate && python -m pytest archivebox/tests/test_migrations_08_to_09.py -xvs 2>&1 | head -200'
```