Real-World Benchmark

500,000 chunks · Wikipedia + arXiv + Project Gutenberg · NVIDIA A40 · Independently verifiable — download and run it yourself.

32× smaller than float32 RAG · 48× smaller than an HNSW index · 96× with BGE-base 256-bit · 1.000 Recall@10

Corpus & Methodology

Mixed real-world corpus across three domains — general knowledge, scientific literature, and long-form prose.

| Source | Domain | Size | Description |
|---|---|---|---|
| Wikipedia (Simple English) | General knowledge | ~100 MB | Encyclopedia articles |
| arXiv papers | Science / ML | ~40 MB | CS & ML abstracts + intros |
| Project Gutenberg | Literature | ~28 MB | Public-domain books |
| **Total** | Mixed | ~168 MB raw | 642,939 paragraphs → 500,000 chunks |

Chunking: 400 words / chunk, 50-word overlap. Embedding: BAAI/bge-m3 (1024-dim) on NVIDIA A40. Recall: 1,000 queries vs exact cosine top-k ground truth.
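The stated chunking parameters (400 words per chunk, 50-word overlap) correspond to a simple sliding-window split over the word stream. A minimal sketch — function and variable names are ours, not NodeMind's:

```python
def chunk_words(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Sliding-window word chunker: `size`-word chunks, `overlap`-word overlap."""
    words = text.split()
    step = size - overlap          # 350-word stride between chunk starts
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):   # last window already covered the tail
            break
    return chunks

demo = " ".join(f"w{i}" for i in range(1000))
parts = chunk_words(demo)
print(len(parts), [len(p.split()) for p in parts])  # 3 [400, 400, 300]
```

Consecutive chunks share their boundary 50 words, so no sentence straddling a chunk edge is lost from both chunks.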

Self-retrieval protocol — queries are perturbed corpus chunks, a setup that flatters binary methods. See Caveats below.
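A toy version of this evaluation loop, with synthetic vectors and a generic sign-binariser standing in for NodeMind's codec (it is not the patented method), shows how Recall@k is scored against exact cosine ground truth:

```python
# Synthetic stand-in for the benchmark harness: queries are lightly
# perturbed corpus vectors, mirroring the self-retrieval protocol.
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(1000, 256)).astype(np.float32)
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
queries = docs[:100] + 0.05 * rng.normal(size=(100, 256)).astype(np.float32)
queries /= np.linalg.norm(queries, axis=1, keepdims=True)

truth = np.argsort(-(queries @ docs.T), axis=1)[:, :10]   # exact cosine top-10
ham = ((queries > 0)[:, None, :] != (docs > 0)[None, :, :]).sum(axis=2)
approx = np.argsort(ham, axis=1)[:, :10]                  # Hamming top-10

def recall_at_k(retrieved, truth, k=10):
    """Mean fraction of ground-truth top-k found in the retrieved top-k."""
    return float(np.mean([len(set(r) & set(t)) / k for r, t in zip(retrieved, truth)]))

print(f"Recall@10 (sign bits vs exact cosine): {recall_at_k(approx, truth):.3f}")
```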

Retrieval Accuracy

BGE-M3 · 1024-bit binary fingerprints

| Metric | NodeMind MIH | Ground Truth |
|---|---|---|
| Recall@1 | 0.999 | 1.000 |
| Recall@3 | 0.999 | 1.000 |
| Recall@5 | 1.000 | 1.000 |
| Recall@10 | 1.000 | 1.000 |
| Recall@20 | 1.000 | 1.000 |
| MRR@10 | 0.9992 | 1.000 |

BGE-base · 768-bit and 256-bit (PCA)

| Metric | 768-bit | 256-bit (PCA) |
|---|---|---|
| Recall@1 | 0.999 | 1.000 |
| Recall@5 | 1.000 | 1.000 |
| Recall@10 | 1.000 | 1.000 |
| MRR@10 | 0.9995 | 1.000 |

Index Size — 500,000 Chunks

| Index | Size | Bytes / chunk | vs float32 |
|---|---|---|---|
| NodeMind BGE-M3 (1024-bit) | 64 MB | 128 B | 32× smaller |
| Float32 RAG — BGE-M3 (baseline) | 2,048 MB | 4,096 B | 1× (reference) |
| HNSW index (float32 × 1.5 overhead) | 3,072 MB | 6,144 B | 48× larger than NodeMind |
| NodeMind BGE-base 256-bit (PCA) | 16 MB | 32 B | 96× smaller |

Index sizes only — document text is stored separately, and identically, in all systems.
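The table's numbers follow from simple arithmetic: bits per fingerprint ÷ 8 gives bytes per chunk, times 500,000 chunks gives index size. A quick check (the 96× row compares BGE-base's 768 float32 dimensions against 32 bytes):

```python
# Sanity-check the "Bytes / chunk" and "Size" columns, using the bit widths
# and the 1.5x HNSW overhead factor stated above.
N = 500_000
rows = [
    ("NodeMind BGE-M3 1024-bit", 1024 // 8),            # 128 B/chunk
    ("Float32 RAG 1024-dim",     1024 * 4),             # 4,096 B/chunk
    ("HNSW (float32 x 1.5)",     int(1024 * 4 * 1.5)),  # 6,144 B/chunk
    ("NodeMind 256-bit PCA",     256 // 8),             # 32 B/chunk
]
for name, bpc in rows:
    print(f"{name:26s} {bpc:5d} B/chunk  {N * bpc / 1e6:6.0f} MB")

print(4096 // 128, int(1024 * 4 * 1.5) // 128, (768 * 4) // 32)  # 32 48 96
```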

Download — Verify It Yourself

All files generated from the same 500,000 chunks. Download NodeMind + float32 RAG side by side to verify compression ratios yourself.

- nm_bgem3_index.pkl (64 MB) — NodeMind BGE-M3 binary fingerprint index, 32× smaller than float32
- rag_bgem3_index.pkl (2,048 MB) — float32 RAG baseline; verify the 32× compression yourself
- hnsw_size_reference.txt (<1 KB) — HNSW = float32 × 1.5; formula and explanation for the 48× number
- nm_bgebase256_index.pkl (16 MB) — NodeMind BGE-base 256-bit PCA, 96× smaller than float32
- corpus.pkl (~144 MB) — 500K text chunks, the shared source for all indexes
- NodeMind_RealWorld_Benchmark.pdf (~1 MB) — full benchmark report with methodology, tables, and caveats

Verify compression in Python

# pip install sentence-transformers (only needed for query, not verification)
import pickle

with open("nm_bgem3_index.pkl", "rb") as f:
    nm = pickle.load(f)
with open("rag_bgem3_index.pkl", "rb") as f:
    rag = pickle.load(f)

nm_mb  = nm["fps"].nbytes / 1e6           # → 64
rag_mb = rag["embeddings"].nbytes / 1e6   # → 2048
ratio  = rag["embeddings"].nbytes // nm["fps"].nbytes   # → 32

print(f"NodeMind : {nm_mb:.0f} MB")
print(f"Float32  : {rag_mb:.0f} MB")
print(f"Ratio    : {ratio}×")

# BGE-base 256-bit (96×)
with open("nm_bgebase256_index.pkl", "rb") as f:
    nm96 = pickle.load(f)
# nm96["fps"] shape: (500000, 32)  →  16 MB
# float32 baseline: 500000 × 768 × 4 B = 1,536 MB  →  96×

Run a query

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")
fps = nm["fps"]
with open("corpus.pkl", "rb") as f:
    corpus = pickle.load(f)
chunks = corpus["chunks"]

POPCOUNT = np.array([bin(i).count('1') for i in range(256)], dtype=np.int32)

def query_nodemind(text, top_k=5):
    emb  = model.encode([text], normalize_embeddings=True)[0]
    q_fp = _binarise(emb, nm)  # binarisation uses index metadata (patent-protected)
    dists = POPCOUNT[np.bitwise_xor(fps, q_fp[np.newaxis, :])].sum(axis=1)
    top   = np.argsort(dists)[:top_k]
    return [(int(dists[i]), chunks[i][:120]) for i in top]

for dist, text in query_nodemind("What is quantum entanglement?"):
    print(f"  [{dist:4d}] {text}")

The _binarise function uses the metadata stored in the .pkl file. The exact method is covered by AU 2026901656; the index is self-contained, so you can query it without reading the patent.

How It Works

1. Embed

Text is chunked and embedded with a sentence model (BGE-M3 or BGE-base), producing a float32 vector per chunk.

2. Binarise

Each embedding is converted to a compact binary fingerprint using pre-computed index metadata. The conversion is integer-only — no GPU is needed at query time. The method is patent-protected (AU 2026901656).
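To illustrate only the float32 → packed-bits data flow, here is the textbook sign-threshold binariser. This is NOT NodeMind's patented method (AU 2026901656), which derives its binarisation from index metadata:

```python
# Generic sign-threshold binarisation — illustrative baseline only,
# not the patented NodeMind codec.
import numpy as np

def sign_binarise(emb: np.ndarray) -> np.ndarray:
    bits = (emb > 0).astype(np.uint8)   # 1 bit per embedding dimension
    return np.packbits(bits)            # 1024 dims → 128 uint8 bytes

emb = np.random.default_rng(0).normal(size=1024).astype(np.float32)
print(sign_binarise(emb).shape)  # (128,)
```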

3. Index (MIH)

Binary fingerprints are stored in a Multi-Index Hash (MIH) structure. A query finds candidates by Hamming distance — pure integer arithmetic that runs on any CPU. The MIH structure follows Norouzi et al. (CVPR 2012); the novel contribution (AU 2026901657) is CTV binarisation plus a portable single-file format.
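The MIH idea from the cited paper can be sketched in a few lines. Split each packed code into m byte bands; any code within Hamming radius r of the query must match it exactly in at least one band when r < m. This sketch does exact-match probes only — a real implementation also enumerates near-miss buckets and verifies full Hamming distance on the candidates:

```python
# Minimal Multi-Index Hashing sketch (structure from Norouzi et al., CVPR 2012).
import numpy as np
from collections import defaultdict

def build_mih(codes: np.ndarray, m: int = 4):
    """codes: (N, B) uint8 packed fingerprints, bucketed by m byte bands."""
    bands = np.array_split(np.arange(codes.shape[1]), m)
    tables = []
    for band in bands:
        table = defaultdict(list)
        for i, row in enumerate(codes):
            table[row[band].tobytes()].append(i)
        tables.append((band, table))
    return tables

def mih_candidates(tables, q: np.ndarray) -> set:
    """Union of doc ids that match the query exactly in at least one band."""
    cands = set()
    for band, table in tables:
        cands.update(table.get(q[band].tobytes(), []))
    return cands

codes = np.random.default_rng(0).integers(0, 256, size=(1000, 16), dtype=np.uint8)
tables = build_mih(codes)
q = codes[42].copy()
q[0] ^= 0b101                            # flip 2 bits, both inside band 0
print(42 in mih_candidates(tables, q))   # True — bands 1..3 still match exactly
```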

4. Query

Embed the query → binarise → Hamming search. Everything lives in a single .pkl file — no server, no Docker, no external database.

Honest Caveats

Patents

- AU 2026901656 — WHT Integer Codec: integer-only binarisation without a learned projection.
- AU 2026901657 — NodeMind Centroid MIH: CTV-based binary fingerprinting + MIH search.

Filed IP Australia, May 2026. Built in Coleambally, NSW, Australia.