This tool cleans up LightRAG's LLM query cache from KV storage implementations. It specifically targets query caches generated during RAG query operations (modes: mix, hybrid, local, global), including both query and keywords caches.
The tool cleans up the following query cache types:
- mix:* - Mixed mode query caches
- hybrid:* - Hybrid mode query caches
- local:* - Local mode query caches
- global:* - Global mode query caches
- *:query:* - Query result caches
- *:keywords:* - Keywords extraction caches

Cache key format: <mode>:<cache_type>:<hash>
Examples:
- mix:query:5ce04d25e957c290216cee5bfe6344fa
- mix:keywords:fee77b98244a0b047ce95e21060de60e
- global:query:abc123def456...
- local:keywords:789xyz...

Important Note: This tool does NOT clean extraction caches (default:extract:* and default:summary:*). Use the migration tool or manual deletion for those caches.
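The key format described above can be checked with a small parser. This is a minimal sketch, not the tool's own code: `CacheKey` and `parse_query_cache_key` are hypothetical names, and the rule that the hash segment must be non-empty is an assumption.

```python
from typing import NamedTuple, Optional

QUERY_MODES = ("mix", "hybrid", "local", "global")
CACHE_TYPES = ("query", "keywords")


class CacheKey(NamedTuple):
    mode: str
    cache_type: str
    digest: str


def parse_query_cache_key(key: str) -> Optional[CacheKey]:
    """Return the parsed key, or None if it is not a query cache key."""
    parts = key.split(":", 2)
    if len(parts) != 3:
        return None
    mode, cache_type, digest = parts
    # Extraction caches such as default:extract:* are rejected here.
    if mode not in QUERY_MODES or cache_type not in CACHE_TYPES or not digest:
        return None
    return CacheKey(mode, cache_type, digest)
```

Keys such as `default:extract:...` return `None`, which mirrors the note that extraction caches are out of scope for this tool.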
Run from the LightRAG project root directory:
python -m lightrag.tools.clean_llm_query_cache
# or
python lightrag/tools/clean_llm_query_cache.py
The tool guides you through the following steps:
============================================================
LLM Query Cache Cleanup Tool - LightRAG
============================================================
=== Storage Setup ===
Supported KV Storage Types:
[1] JsonKVStorage
[2] RedisKVStorage
[3] PGKVStorage
[4] MongoKVStorage
[5] OpenSearchKVStorage
Select storage type (1-5) (Press Enter to exit): 1
Note: You can press Enter or type 0 at any prompt to exit gracefully.
The tool will:
Verify connection status
Checking configuration...
✓ All required environment variables are set
Initializing storage...
- Storage Type: JsonKVStorage
- Workspace: space1
- Connection Status: ✓ Success
The tool displays a detailed breakdown of query caches by mode and type:
Counting query cache records...
📊 Query Cache Statistics (Before Cleanup):
┌────────────┬────────────┬────────────┬────────────┐
│ Mode       │ Query      │ Keywords   │ Total      │
├────────────┼────────────┼────────────┼────────────┤
│ mix        │ 1,234      │ 567        │ 1,801      │
│ hybrid     │ 890        │ 423        │ 1,313      │
│ local      │ 2,345      │ 1,123      │ 3,468      │
│ global     │ 678        │ 345        │ 1,023      │
├────────────┼────────────┼────────────┼────────────┤
│ Total      │ 5,147      │ 2,458      │ 7,605      │
└────────────┴────────────┴────────────┴────────────┘
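A breakdown like the one above could be produced by counting keys per mode and cache type. The sketch below works on a plain list of key strings; `count_query_caches` and `format_row` are hypothetical helpers, and the tool itself may gather these counts differently for each backend.

```python
from collections import Counter
from typing import Iterable

MODES = ("mix", "hybrid", "local", "global")
CACHE_TYPES = ("query", "keywords")


def count_query_caches(keys: Iterable[str]) -> Counter:
    """Count keys per (mode, cache_type); other keys are ignored."""
    counts: Counter = Counter()
    for key in keys:
        parts = key.split(":", 2)
        if len(parts) == 3 and parts[0] in MODES and parts[1] in CACHE_TYPES:
            counts[(parts[0], parts[1])] += 1
    return counts


def format_row(mode: str, counts: Counter) -> str:
    """Render one table row with thousands separators."""
    query = counts[(mode, "query")]
    keywords = counts[(mode, "keywords")]
    return f"{mode:<10} {query:>10,} {keywords:>10,} {query + keywords:>10,}"
```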
Choose what type of caches to delete:
=== Cleanup Options ===
[1] Delete all query caches (both query and keywords)
[2] Delete query caches only (keep keywords)
[3] Delete keywords caches only (keep query)
[0] Cancel
Select cleanup option (0-3): 1
Cleanup Types:
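As a rough sketch, each option maps to a filter on the cache type segment of the key. The identifiers `query` and `keywords` for options 2 and 3 are assumptions (only `all` appears in the confirmation output), and `keys_to_delete` is a hypothetical helper.

```python
from typing import Iterable, List

MODES = ("mix", "hybrid", "local", "global")
CACHE_TYPES = ("query", "keywords")


def keys_to_delete(keys: Iterable[str], cleanup_type: str) -> List[str]:
    """Select query cache keys for 'all', 'query', or 'keywords'."""
    if cleanup_type == "all":
        wanted = CACHE_TYPES
    elif cleanup_type in CACHE_TYPES:
        wanted = (cleanup_type,)
    else:
        raise ValueError(f"unknown cleanup type: {cleanup_type!r}")
    # Build every "<mode>:<cache_type>:" prefix that should match.
    prefixes = tuple(f"{mode}:{ctype}:" for mode in MODES for ctype in wanted)
    return [key for key in keys if key.startswith(prefixes)]
```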
Review the cleanup plan and confirm:
============================================================
Cleanup Confirmation
============================================================
Storage: JsonKVStorage (workspace: space1)
Cleanup Type: all
Records to Delete: 7,605 / 7,605
⚠️ WARNING: This will delete ALL query caches across all modes!
Continue with deletion? (y/n): y
The tool performs batch deletion with real-time progress:
JsonKVStorage Example:
=== Starting Cleanup ===
💡 Processing 1,000 records at a time from JsonKVStorage
Batch 1/8: ███░░░░░░░░░░░░░░░░░ 1,000/7,605 (13.1%) ✓
Batch 2/8: █████░░░░░░░░░░░░░░░ 2,000/7,605 (26.3%) ✓
...
Batch 8/8: ████████████████████ 7,605/7,605 (100.0%) ✓
Persisting changes to storage...
✓ Changes persisted successfully
RedisKVStorage Example:
=== Starting Cleanup ===
💡 Processing Redis keys in batches of 1,000
Batch 1: Deleted 1,000 keys (Total: 1,000) ✓
Batch 2: Deleted 1,000 keys (Total: 2,000) ✓
...
PostgreSQL Example:
=== Starting Cleanup ===
💡 Executing PostgreSQL DELETE query
✓ Deleted 7,605 records in 0.45s
MongoDB Example:
=== Starting Cleanup ===
💡 Executing MongoDB deleteMany operations
Pattern 1/8: Deleted 1,234 records ✓
Pattern 2/8: Deleted 567 records ✓
...
Total deleted: 7,605 records
OpenSearchKVStorage Example:
=== Starting Cleanup ===
💡 Processing 1,000 records at a time from OpenSearchKVStorage
Batch 1/8: ███░░░░░░░░░░░░░░░░░ 1,000/7,605 (13.1%) ✓
Batch 2/8: █████░░░░░░░░░░░░░░░ 2,000/7,605 (26.3%) ✓
...
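The batched examples above follow a common pattern: split the keys into fixed-size batches, delete each batch, and record failures without stopping. The sketch below illustrates that pattern only. `delete_in_batches` and the `delete_batch` callback are hypothetical, and the real tool's per-backend logic may differ.

```python
import math
from typing import Callable, List, Sequence, Tuple

BatchError = Tuple[int, str, str, int]  # (batch number, error type, message, records)


def delete_in_batches(
    keys: Sequence[str],
    delete_batch: Callable[[Sequence[str]], None],
    batch_size: int = 1000,
) -> Tuple[int, int, List[BatchError]]:
    """Delete keys batch by batch; return (total batches, deleted, errors)."""
    total_batches = math.ceil(len(keys) / batch_size)
    deleted = 0
    errors: List[BatchError] = []
    for index in range(total_batches):
        batch = keys[index * batch_size:(index + 1) * batch_size]
        try:
            delete_batch(batch)
            deleted += len(batch)
        except Exception as exc:
            # A failed batch is recorded and the loop continues.
            errors.append((index + 1, type(exc).__name__, str(exc), len(batch)))
    return total_batches, deleted, errors
```

With 7,605 keys and a batch size of 1,000 this yields 8 batches, matching the `Batch 1/8` counters in the sample output.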
The tool provides a comprehensive final report:
Successful Cleanup:
============================================================
Cleanup Complete - Final Report
============================================================
📊 Statistics:
Total records to delete: 7,605
Total batches: 8
Successful batches: 8
Failed batches: 0
Successfully deleted: 7,605
Failed to delete: 0
Success rate: 100.00%
📈 Before/After Comparison:
Total caches before: 7,605
Total caches after: 0
Net reduction: 7,605
============================================================
✓ SUCCESS: All records cleaned up successfully!
============================================================
📊 Query Cache Statistics (After Cleanup):
┌────────────┬────────────┬────────────┬────────────┐
│ Mode       │ Query      │ Keywords   │ Total      │
├────────────┼────────────┼────────────┼────────────┤
│ mix        │ 0          │ 0          │ 0          │
│ hybrid     │ 0          │ 0          │ 0          │
│ local      │ 0          │ 0          │ 0          │
│ global     │ 0          │ 0          │ 0          │
├────────────┼────────────┼────────────┼────────────┤
│ Total      │ 0          │ 0          │ 0          │
└────────────┴────────────┴────────────┴────────────┘
Cleanup with Errors:
============================================================
Cleanup Complete - Final Report
============================================================
📊 Statistics:
Total records to delete: 7,605
Total batches: 8
Successful batches: 7
Failed batches: 1
Successfully deleted: 6,605
Failed to delete: 1,000
Success rate: 86.85%
📈 Before/After Comparison:
Total caches before: 7,605
Total caches after: 1,000
Net reduction: 6,605
⚠️ Errors encountered: 1
Error Details:
------------------------------------------------------------
Error Summary:
- ConnectionError: 1 occurrence(s)
First 5 errors:
1. Batch 3
Type: ConnectionError
Message: Connection timeout after 30s
Records lost: 1,000
============================================================
⚠️ WARNING: Cleanup completed with errors!
Please review the error details above.
============================================================
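The figures in the two reports above follow from simple arithmetic on the record counts. The sketch below reproduces the success rate and the error summary. `build_report` is a hypothetical helper, and reporting 100% when there is nothing to delete is an assumption.

```python
from collections import Counter
from typing import List


def build_report(total: int, deleted: int, error_types: List[str]) -> List[str]:
    """Return report lines for a cleanup run."""
    failed = total - deleted
    # Assumption: an empty run counts as fully successful.
    rate = deleted / total * 100 if total else 100.0
    lines = [
        f"Total records to delete: {total:,}",
        f"Successfully deleted: {deleted:,}",
        f"Failed to delete: {failed:,}",
        f"Success rate: {rate:.2f}%",
    ]
    # Group errors by exception type name for the summary.
    for name, count in Counter(error_types).most_common():
        lines.append(f"- {name}: {count} occurrence(s)")
    return lines
```

For example, `build_report(7605, 6605, ["ConnectionError"])` gives a success rate of 86.85%, the same value shown in the report with errors.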
The tool retrieves workspace in the following priority order:
1. Storage-specific workspace environment variables: POSTGRES_WORKSPACE, MONGODB_WORKSPACE, REDIS_WORKSPACE, OPENSEARCH_WORKSPACE
2. Generic workspace environment variable: WORKSPACE
3. Default value
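The priority order above can be expressed as a short lookup. This is a sketch under assumptions: the mapping from storage class to environment variable is inferred from the variable names, an empty variable is treated as unset, and `resolve_workspace` is a hypothetical name.

```python
import os
from typing import Mapping, Optional

# Assumed pairing of storage class names and their workspace variables.
STORAGE_WORKSPACE_ENV = {
    "PGKVStorage": "POSTGRES_WORKSPACE",
    "MongoKVStorage": "MONGODB_WORKSPACE",
    "RedisKVStorage": "REDIS_WORKSPACE",
    "OpenSearchKVStorage": "OPENSEARCH_WORKSPACE",
}


def resolve_workspace(storage_name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Storage-specific variable first, then WORKSPACE, then empty string."""
    env = os.environ if env is None else env
    specific = STORAGE_WORKSPACE_ENV.get(storage_name)
    if specific and env.get(specific):
        return env[specific]
    return env.get("WORKSPACE", "")
```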
JsonKVStorage:
# Direct key prefix matching
if key.startswith("mix:query:") or key.startswith("mix:keywords:"):
RedisKVStorage:
# SCAN with namespace-prefixed patterns
pattern = f"{namespace}:mix:query:*"
cursor, keys = await redis.scan(cursor, match=pattern)
PostgreSQL:
# SQL LIKE conditions
WHERE id LIKE 'mix:query:%' OR id LIKE 'mix:keywords:%'
MongoDB:
# Regex queries on _id field
{"_id": {"$regex": "^mix:query:"}}
OpenSearchKVStorage:
# Scan raw hits, then match cache key prefixes in Python
if hit["_id"].startswith("mix:query:"):
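All of the snippets above match the same set of prefixes, one per mode and cache type. The sketch below generates that set (4 modes x 2 types = 8 prefixes) and renders it in each backend's syntax. The helper names and the `id` column default are illustrative, and the strings are built from fixed constants only, never from user input.

```python
from itertools import product
from typing import Dict, List

MODES = ("mix", "hybrid", "local", "global")
CACHE_TYPES = ("query", "keywords")


def cache_prefixes() -> List[str]:
    """Every '<mode>:<cache_type>:' prefix, in mode order."""
    return [f"{mode}:{ctype}:" for mode, ctype in product(MODES, CACHE_TYPES)]


def redis_match_patterns(namespace: str) -> List[str]:
    """Glob patterns suitable for the MATCH option of SCAN."""
    return [f"{namespace}:{prefix}*" for prefix in cache_prefixes()]


def sql_like_condition(column: str = "id") -> str:
    """OR-joined LIKE conditions over constant prefixes."""
    return " OR ".join(f"{column} LIKE '{prefix}%'" for prefix in cache_prefixes())


def mongo_filters() -> List[Dict[str, Dict[str, str]]]:
    """Anchored regex filters on _id; prefixes contain no regex metacharacters."""
    return [{"_id": {"$regex": f"^{prefix}"}} for prefix in cache_prefixes()]
```

Eight prefixes is consistent with the `Pattern 1/8` counter in the MongoDB sample output.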
The tool implements comprehensive error tracking:
After cleanup completes, a detailed report includes:
Irreversible Operation
Performance Impact
Selective Cleanup
Workspace Isolation
Interrupt and Resume
The tool supports multiple configuration methods with the following priority:
Configure storage settings in your .env file:
# Generic workspace (shared by all storages)
WORKSPACE=space1
# Or configure independent workspace for specific storage
POSTGRES_WORKSPACE=pg_space
MONGODB_WORKSPACE=mongo_space
REDIS_WORKSPACE=redis_space
Workspace Priority: Storage-specific > Generic WORKSPACE > Empty string
WORKING_DIR=./rag_storage
REDIS_URI=redis://localhost:6379
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=your_username
POSTGRES_PASSWORD=your_password
POSTGRES_DATABASE=your_database
MONGO_URI=mongodb://root:root@localhost:27017/
MONGO_DATABASE=LightRAG
OPENSEARCH_HOSTS=localhost:9200
OPENSEARCH_WORKSPACE=search_space
If environment variables are not provided, the tool falls back to built-in defaults where available.
⚠️ Warning: Missing environment variables: POSTGRES_USER, POSTGRES_PASSWORD
Solution: Add missing variables to your .env file
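A check like the one behind this warning can be sketched as follows. The list of required variables is an assumption for illustration (the tool's real list per storage type is not shown here), and `missing_env_vars` is a hypothetical helper.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Assumed required variables for PostgreSQL, for illustration only.
POSTGRES_REQUIRED = ("POSTGRES_USER", "POSTGRES_PASSWORD", "POSTGRES_DATABASE")


def missing_env_vars(
    required: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def format_warning(missing: Sequence[str]) -> str:
    """Render the warning line, or an empty string if nothing is missing."""
    if not missing:
        return ""
    return "Warning: Missing environment variables: " + ", ".join(missing)
```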
✗ Initialization failed: Connection refused
Solutions:
⚠️ No query caches found in storage
Possible Reasons:
⚠️ WARNING: Cleanup completed with errors!
Solutions:
Scenario: Free up storage space by removing all query caches
# Run tool
python -m lightrag.tools.clean_llm_query_cache
# Select: Storage type -> Option 1 (all) -> Confirm (y)
Result: All query and keywords caches deleted, maximum storage freed
Scenario: Force query cache rebuild while keeping keywords
# Run tool
python -m lightrag.tools.clean_llm_query_cache
# Select: Storage type -> Option 2 (query only) -> Confirm (y)
Result: Query caches deleted, keywords preserved for faster rebuild
Scenario: Remove outdated keywords while keeping recent query results
# Run tool
python -m lightrag.tools.clean_llm_query_cache
# Select: Storage type -> Option 3 (keywords only) -> Confirm (y)
Result: Keywords deleted, query caches preserved
Scenario: Clean caches for a specific workspace
# Configure workspace
export WORKSPACE=development
# Run tool
python -m lightrag.tools.clean_llm_query_cache
# Select: Storage type -> Cleanup option -> Confirm (y)
Result: Only development workspace caches cleaned
Backup Before Cleanup
Monitor Performance
Scheduled Cleanup
Selective Deletion
Storage Capacity
| Feature | Cleanup Tool | Migration Tool |
|---|---|---|
| Purpose | Delete query caches | Migrate extraction caches |
| Cache Types | mix/hybrid/local/global | default:extract/summary |
| Modes | query, keywords | extract, summary |
| Operation | Deletion | Copy between storages |
| Reversible | No | Yes (source unchanged) |
| Use Case | Free storage, refresh caches | Change storage backend |
Single Storage Operation
No Dry Run Mode
No Selective Mode Cleanup
No Scheduled Cleanup
Verification Limitations
Potential improvements for future versions:
For issues, questions, or feature requests: