
🐞 LadybugDB

Python 3.9+ | License: Apache 2.0 | Code style: black | PRs Welcome

Unified cognitive substrate: SQL + Cypher + Vector + Hamming over LanceDB.

One database. All operations. Zero copies.


Why LadybugDB?

Modern AI systems need four query types:

| Need | Traditional | LadybugDB |
|------|-------------|-----------|
| Analytics | PostgreSQL | ✅ DuckDB SQL |
| Graphs | Neo4j | ✅ Cypher → Recursive CTEs |
| Vectors | Pinecone | ✅ LanceDB ANN |
| Fingerprints | Custom code | ✅ AVX-512 SIMD |

That's 4 databases, 4 sync mechanisms, 4 points of failure.

LadybugDB collapses them into one. Zero-copy operations. 65M Hamming comparisons/sec. Familiar APIs.


Installation

pip install ladybugdb

With extras:

# Quote the extras so shells such as zsh do not expand the brackets
pip install "ladybugdb[all]"     # Everything
pip install "ladybugdb[numba]"   # SIMD acceleration
pip install "ladybugdb[jina]"    # Jina embeddings

From source:

git clone https://github.com/AdaWorldAPI/ladybugdb.git
cd ladybugdb && pip install -e ".[dev]"

Quick Start

Connect and Query

from ladybugdb import connect

db = connect("./mydb")

# SQL
db.sql("SELECT * FROM nodes WHERE label = 'Thought'")

# Cypher (auto-transpiled)
db.cypher("MATCH (a)-[:CAUSES*1..5]->(b) RETURN b")

# Resonance (Hamming similarity)
db.resonate(fingerprint, threshold=0.6)

# Vector search
db.vector_search(embedding, k=10)
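
To give a feel for what "auto-transpiled" means, a variable-length pattern such as `(a)-[:CAUSES*1..5]->(b)` can be written as a recursive CTE. The sketch below is an illustration only, not LadybugDB's actual transpiler output: the `edges(src, dst, rel)` table, the `reachable` helper, and the fixed start node are assumptions, and Python's built-in `sqlite3` stands in for DuckDB so the example runs without extra dependencies.

```python
import sqlite3

# Hypothetical edge table; LadybugDB's real schema may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (src TEXT, dst TEXT, rel TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("config", "cache", "CAUSES"),
        ("cache", "latency", "CAUSES"),
        ("latency", "outage", "CAUSES"),
        ("config", "docs", "MENTIONS"),  # different relationship, ignored
    ],
)

# Rough equivalent of: MATCH (a {id: ?})-[:CAUSES*1..5]->(b) RETURN b
REACHABLE = """
WITH RECURSIVE reach(node, depth) AS (
    -- Base case: direct CAUSES neighbours of the start node (depth 1)
    SELECT dst, 1 FROM edges WHERE src = ? AND rel = 'CAUSES'
    UNION
    -- Recursive step: follow one more CAUSES edge, up to depth 5
    SELECT e.dst, r.depth + 1
    FROM reach AS r
    JOIN edges AS e ON e.src = r.node
    WHERE e.rel = 'CAUSES' AND r.depth < 5
)
SELECT node, MIN(depth) AS depth FROM reach GROUP BY node ORDER BY depth, node
"""

def reachable(conn, start):
    """Return (node, shortest_depth) for nodes within 5 CAUSES hops of start."""
    return conn.execute(REACHABLE, (start,)).fetchall()

rows = reachable(conn, "config")
for node, depth in rows:
    print(node, depth)
```

The `depth < 5` guard plays the role of the `*1..5` upper bound and also guarantees termination on cyclic graphs.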

Jina-Compatible API

from ladybugdb.compat import JinaClient, Document

client = JinaClient()
client.index([
    Document(id="1", content="Quantum entanglement"),
    Document(id="2", content="Classical mechanics"),
])
results = client.search("quantum", top_k=10)

Neo4j-Compatible API

from ladybugdb.compat import GraphDatabase

driver = GraphDatabase.driver("ladybug://./mydb")
with driver.session() as session:
    result = session.run("""
        MATCH path = (a:Config)-[:CAUSES*1..5]->(b:Failure)
        RETURN b, length(path) AS depth
    """)

DTOs (Pydantic-Style)

from ladybugdb.compat import Node, Thought, Handover

node = Node(id="t1", content="Sky is blue", qidx=180)
node.fingerprint  # Auto-computed 10K bits

handover = Handover(
    from_agent="Archaeologist",
    to_agent="Developer", 
    task="Fix N+1 query"
)
print(handover.to_markdown())  # For LLM context

Core Features

🔍 Unified Query Engine

One interface, all query types:

# SQL analytics
db.query("SELECT label, COUNT(*) FROM nodes GROUP BY label")

# Graph traversal  
db.query("MATCH path = (a)-[*1..10]->(b) RETURN path")

# Semantic search
db.query("VECTOR_SEARCH(embedding, 10)")

# Resonance matching
db.query("RESONATE(fingerprint, 0.6)")

⚡ Zero-Overhead Hamming

from ladybugdb.core import HammingEngine

engine = HammingEngine()
engine.index(corpus)  # Index once

# Search: 65M comparisons/sec, zero allocation
result = engine.search(query, k=10)

Performance:

| Corpus | Time | Throughput |
|--------|------|------------|
| 10K | 150 μs | 65M/sec |
| 100K | 1.5 ms | 65M/sec |
| 1M | 15 ms | 65M/sec |
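
For readers unfamiliar with Hamming search, here is a minimal NumPy sketch of the underlying idea: XOR two bit-packed fingerprints and count the set bits. It is not the AVX-512/Numba kernel and will not reach the throughput quoted above. The helper names (`pack`, `hamming_distances`, `search`) are hypothetical, the 10,000-bit width is taken from the DTO section, and defining similarity as `1 - distance / bits` is an assumption about how a resonance threshold might be interpreted.

```python
import numpy as np

N_BITS = 10_000  # fingerprint width mentioned in the DTO section

# Lookup table: number of set bits for every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint16)

def pack(bits):
    """Pack an array of 0/1 values into bytes along the last axis."""
    return np.packbits(bits.astype(np.uint8), axis=-1)

def hamming_distances(corpus, query):
    """Hamming distance from one packed query to every packed corpus row."""
    xor = np.bitwise_xor(corpus, query)   # differing bits become 1
    return POPCOUNT[xor].sum(axis=1)      # count them per row

def search(corpus, query, k=10):
    """Return (indices, distances) of the k nearest rows, closest first."""
    dist = hamming_distances(corpus, query)
    k = min(k, len(dist))
    nearest = np.argpartition(dist, k - 1)[:k]
    nearest = nearest[np.argsort(dist[nearest], kind="stable")]
    return nearest, dist[nearest]

rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(1_000, N_BITS))
corpus = pack(bits)

# Query: row 42 with its first 100 bits flipped.
noisy = bits[42].copy()
noisy[:100] ^= 1
idx, dist = search(corpus, pack(noisy), k=3)
similarity = 1.0 - dist / N_BITS  # assumed definition of similarity
print(idx[0], dist[0], similarity[0])
```

The closest match is row 42 at distance 100, while unrelated random fingerprints sit near distance 5,000.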

🦋 Butterfly Detection

Track causal amplification chains:

butterflies = db.detect_butterflies(
    source="config_change",
    threshold=2.0,  # amplification > 2x
    max_depth=10
)

for path, amp in butterflies:
    print(f"{amp:.1f}x: {' → '.join(path)}")
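
One plausible reading of "amplification" is the product of per-edge gain factors along a causal path. The sketch below works under that assumption; it is not LadybugDB's implementation. The `find_amplified_paths` function, the `graph` dictionary layout, and the gain values are all hypothetical.

```python
def find_amplified_paths(graph, source, threshold=2.0, max_depth=10):
    """Depth-first search for paths whose cumulative gain exceeds threshold.

    graph maps node -> list of (neighbor, gain) pairs, where gain is a
    hypothetical per-edge amplification factor.
    """
    results = []
    stack = [([source], 1.0)]
    while stack:
        path, amp = stack.pop()
        if len(path) > 1 and amp > threshold:
            results.append((path, amp))
        if len(path) - 1 >= max_depth:
            continue  # path already uses max_depth edges
        for neighbor, gain in graph.get(path[-1], []):
            if neighbor not in path:  # skip cycles
                stack.append((path + [neighbor], amp * gain))
    # Largest amplification first.
    return sorted(results, key=lambda item: item[1], reverse=True)

graph = {
    "config_change": [("cache_miss", 1.5), ("log_noise", 0.5)],
    "cache_miss": [("db_load", 2.0)],
    "db_load": [("timeout", 3.0)],
}

butterflies = find_amplified_paths(graph, "config_change", threshold=2.0)
for path, amp in butterflies:
    print(f"{amp:.1f}x: {' → '.join(path)}")
```

Here the chain ending in `timeout` amplifies 9x and the chain ending in `db_load` amplifies 3x; the `log_noise` branch dampens the signal and is dropped.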

🗜️ Automatic Compression

from ladybugdb.compat import Compressor

compressor = Compressor()
block = compressor.compress(data)
# Auto-selects: Dictionary, RLE, FOR, Delta, or Bitpack
print(f"{block.compression_ratio:.0f}x smaller")
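
"Auto-selects" can be pictured as estimating the encoded size under each scheme and keeping the smallest. The sketch below does this for integer columns with rough estimators for three of the listed schemes (RLE, FOR with bit-packing, and Delta). The estimators, the `choose_encoding` name, and the sample columns are illustrative assumptions, not LadybugDB or BtrBlocks internals.

```python
import numpy as np

def bits_needed(max_value):
    """Bits required to store non-negative integers up to max_value."""
    return max(1, int(max_value).bit_length())

def estimate_sizes(values):
    """Rough encoded size in bits for a 1-D int64 array (length >= 2)."""
    n = len(values)
    deltas = np.diff(values)
    runs = 1 + int(np.count_nonzero(deltas))  # a new run starts at each change
    return {
        "raw": n * 64,
        # RLE: one (64-bit value, 32-bit run length) pair per run.
        "rle": runs * (64 + 32),
        # FOR: store the minimum once, then bit-packed offsets from it.
        "for": 64 + n * bits_needed(values.max() - values.min()),
        # Delta: first value and minimum delta, then bit-packed delta offsets.
        "delta": 128 + (n - 1) * bits_needed(deltas.max() - deltas.min()),
    }

def choose_encoding(values):
    """Return (scheme, estimated compression ratio versus raw int64)."""
    sizes = estimate_sizes(values)
    best = min(sizes, key=sizes.get)
    return best, sizes["raw"] / sizes[best]

constant = np.full(1000, 7)                            # one long run
sequential = np.arange(1_000_000, 1_001_000)           # steady increments
clustered = np.tile(np.array([100, 115, 103, 110]), 250)  # narrow value range

for name, column in [("constant", constant),
                     ("sequential", sequential),
                     ("clustered", clustered)]:
    scheme, ratio = choose_encoding(column)
    print(f"{name}: {scheme}, about {ratio:.0f}x smaller")
```

With these estimators the constant column picks RLE, the sequential column picks Delta, and the clustered column picks FOR.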

Documentation

| Document | Description |
|----------|-------------|
| Quickstart | 5-minute getting started guide |
| API Reference | Complete API documentation |
| Architecture | System internals |
| Compatibility | Jina / Neo4j / Pydantic APIs |
| Performance | Benchmarks and tuning |
| Migration | From Neo4j, Pinecone, etc. |

Architecture

┌──────────────────────────────────────────────────────────┐
│                  Unified Query Interface                 │
│              SQL │ Cypher │ Vector │ Hamming             │
├──────────────────────────────────────────────────────────┤
│ Cypher Transpiler │ Resonance Engine │ Vector Index      │
│ (Recursive CTEs)  │ (AVX-512 SIMD)   │ (HNSW/IVF)        │
├──────────────────────────────────────────────────────────┤
│                    DuckDB SQL Engine                     │
├──────────────────────────────────────────────────────────┤
│                   LanceDB Storage Layer                  │
│          (Lance Format + BtrBlocks Compression)          │
└──────────────────────────────────────────────────────────┘

Requirements

  • Python 3.9+
  • NumPy 1.20+
  • DuckDB 0.9+
  • LanceDB 0.3+ (optional)
  • Numba 0.58+ (optional, for SIMD)

Contributing

See CONTRIBUTING.md for guidelines.

git clone https://github.com/AdaWorldAPI/ladybugdb.git
cd ladybugdb
pip install -e ".[dev]"
pytest tests/

License

Apache License 2.0 (see LICENSE)


Acknowledgments

Built on: LanceDB, DuckDB, Numba

Inspired by: BtrBlocks, Procella


