NodeMind shrinks float32 RAG indexes 32× with BGE-M3, 48× vs HNSW, 96× with BGE-base, and up to 128× on multimodal data — using a patent-pending integer-only binary codec — then searches them in under 1 ms with pure-integer Hamming MIH. No GPU, no vector database, no cloud bills.
A 1 GB text document becomes a 10 GB RAG float32 index — that's the real cost of vector search at scale. NodeMind's binary codec crushes that 10 GB down to just 210 MB online (48× smaller; 32× offline, against the raw float32 embeddings alone). Same documents. Same BGE-M3 embeddings. Dramatically different storage.
Why does RAG expand 10×? Chunking 1 KB of text produces a 1024-dim float32 vector = 4 KB (4× on raw text). HNSW graph index structures add another 2–3×. Result: every 1 GB of documents becomes ~10 GB in a vector database — confirmed by Elasticsearch, Pure Storage, and Milvus benchmarks. NodeMind then compresses that 10 GB RAG index 48× further on text (up to 128× on multimodal data with NM-256) using our patent-pending binary codec.
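The arithmetic behind that ~10× figure, using only the numbers from the paragraph above:

```python
# Expansion from raw text to a graph-indexed float32 RAG index.
chunk_bytes = 1024                    # 1 KB of raw text per chunk
vector_bytes = 1024 * 4               # 1024-dim float32 vector = 4 KB, i.e. 4x on raw text
for hnsw_overhead in (2, 3):          # graph structures add another 2-3x
    print(f"{vector_bytes * hnsw_overhead / chunk_bytes:.0f}x expansion")
# prints 8x and 12x, hence "every 1 GB of documents becomes ~10 GB"
```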
**Storage Comparison**

| Original Documents | RAG Index (float32, ~10× expansion) | NodeMind Index (binary, 48× smaller online) | vs RAG | RAG Storage/mo (S3 Standard) | NodeMind Storage/mo (S3 Standard) | Managed Vector DB/mo (Pinecone pricing) | Annual Savings |
|---|---|---|---|---|---|---|---|
| 1 GB documents (~250K chunks) | 10 GB | 210 MB | 48× | $0.23/mo | $0.0024/mo | $25.00/mo | $300/yr |
| 10 GB documents (~2.5M chunks) | 100 GB | 2.1 GB | 48× | $2.30/mo | $0.024/mo | $250.00/mo | $3,000/yr |
| 100 GB documents (~25M chunks) | 1 TB | 21 GB | 48× | $23.00/mo | $0.24/mo | $2,500/mo | $30,000/yr |
| 1 TB documents (~250M chunks) | 10 TB | 210 GB | 48× | $230/mo | $2.40/mo | $25,000/mo | $300,000/yr |
**Search Performance**

| | RAG Index (float32) | NodeMind Index (binary) |
|---|---|---|
| Search method (same 1024-dim BGE-M3) | Cosine similarity on float32 — O(N·D) multiply-accumulate | Hamming distance on 1024-bit integers — POPCNT only, <1 ms |
| GPU required | Yes — needed for fast cosine at scale | No — pure CPU, any machine |
| RAM for 250M chunks | ~1 TB | ~10 GB |
| Offline / portable | No — requires live vector DB connection | Yes — download zip, run anywhere, no cloud needed |
Codec: NodeMind's compression is not standard binary quantization (which breaks down on out-of-distribution queries). Our patent-pending algorithm is integer-only, deterministic, and produces fingerprints with recall that beats fixed-threshold binary baselines on real BEIR queries. This achieves 32× compression with BGE-M3 (1024-bit), 48× vs HNSW (incl. ~50% graph overhead), 96× with BGE-base (256-bit), and up to 128× on multimodal data (NM-256). Costs use S3 Standard at $0.023/GB/mo vs Pinecone managed vector DB at $2.50/GB/mo.
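The headline ratios follow directly from the bit counts above. A quick check in Python, assuming BGE-base's standard 768 dimensions and (inferred from the 128× figure) 1024-dim embeddings for the NM-256 multimodal case:

```python
# Pure arithmetic behind the 32x / 48x / 96x / 128x claims.
FLOAT32_BITS = 32

def ratio(dims: int, fp_bits: int, graph_overhead: float = 1.0) -> float:
    return dims * FLOAT32_BITS * graph_overhead / fp_bits

print(ratio(1024, 1024))        # 32.0  -- BGE-M3, 1024-bit fingerprint
print(ratio(1024, 1024, 1.5))   # 48.0  -- vs HNSW with ~50% graph overhead
print(ratio(768, 256))          # 96.0  -- BGE-base (768-dim), 256-bit fingerprint
print(ratio(1024, 256))         # 128.0 -- NM-256 (assumes 1024-dim multimodal embeddings)
```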
Note on benchmarks. Compression ratios are mathematical and verifiable with os.path.getsize() on the downloadable indexes — see the interactive benchmark page. Sub-1ms search latency holds at small/medium N; sub-linear scaling to ~12ms is documented in the patent for 100M-chunk indexes. On real out-of-distribution BEIR queries, NodeMind beats standard FAISS Fixed Binary on 3 of 4 datasets at the same compression, and stays within ~5pp of float32 cosine — the 32× / 48× / 96× / 128× compression numbers are the trade you make for that gap.
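As the note says, the ratio is checkable with nothing more than file sizes once the indexes are downloaded; the file names below are placeholders, not NodeMind's actual layout:

```python
import os

rag_bytes = os.path.getsize("rag_index.zip")       # placeholder name: float32 RAG index
nm_bytes = os.path.getsize("nodemind_index.zip")   # placeholder name: NodeMind binary index
print(f"compression: {rag_bytes / nm_bytes:.1f}x")
```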
The NodeMind codec is modality-agnostic — text, images, audio, tables, and code share the same patent-pending binary encoding. Every modality below is measured in the multimodal benchmark.
All modality ratios are measured on real files — images are real Unsplash photos, audio is real ESC-10 environmental WAV files, tables and code are real structured data. See the multimodal benchmark for methodology, queries, and download links.
Three stages — embedding, binary encoding with our proprietary codec, and Multi-Index Hashing search. No gradients. No GPU at query time. Pure integer arithmetic from encoding onward.
Each float32 embedding is converted to a compact binary fingerprint using our patent-pending integer-only algorithm. This is not standard binary quantization (fixed-zero or per-vector mean) — our codec preserves semantic neighbourhood structure far better, so we beat fixed-threshold binary baselines on real BEIR queries. The full method is a trade secret protected under AU 2026904283.
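For contrast, here is what the standard binary quantization baselines named above (fixed-zero and per-vector mean thresholds) look like. This is the approach NodeMind claims to beat, not the codec itself, which remains undisclosed:

```python
import numpy as np

def fixed_zero_binarize(vecs: np.ndarray) -> np.ndarray:
    """Fixed-zero baseline: bit = 1 wherever the component is positive."""
    return np.packbits(vecs > 0, axis=-1)

def per_vector_mean_binarize(vecs: np.ndarray) -> np.ndarray:
    """Per-vector mean baseline: bit = 1 wherever the component exceeds its vector's mean."""
    return np.packbits(vecs > vecs.mean(axis=-1, keepdims=True), axis=-1)

emb = np.random.randn(4, 1024).astype(np.float32)  # 4 KB per vector
fp = fixed_zero_binarize(emb)                      # shape (4, 128): 128 bytes, 32x smaller
```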
The 1024-bit fingerprint is split into 64 sub-strings of 16 bits. Each sub-string indexes into a hash table; at query time, exact matches plus radius-1 Hamming variants per sub-table are merged into a candidate set, then re-ranked by full Hamming distance. Sub-linear exact nearest-neighbour search — no approximate structures.
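A minimal sketch of the MIH scheme just described, assuming fingerprints are Python ints; the names and the pure-Python hash tables are illustrative, not NodeMind's patented implementation:

```python
# Minimal MIH sketch: 64 sub-tables keyed by 16-bit sub-strings, radius-1 probing,
# then a full-Hamming re-rank of the merged candidate set.
from collections import defaultdict

NUM_TABLES, SUB_BITS = 64, 16        # 64 x 16 bits = 1024-bit fingerprint
MASK = (1 << SUB_BITS) - 1

def sub_keys(fp: int) -> list[int]:
    """Split a 1024-bit fingerprint into 64 16-bit sub-string keys."""
    return [(fp >> (i * SUB_BITS)) & MASK for i in range(NUM_TABLES)]

class MIHIndex:
    def __init__(self) -> None:
        self.tables = [defaultdict(list) for _ in range(NUM_TABLES)]
        self.fingerprints: list[int] = []

    def add(self, fp: int) -> None:
        self.fingerprints.append(fp)
        for t, key in enumerate(sub_keys(fp)):
            self.tables[t][key].append(len(self.fingerprints) - 1)

    def search(self, query: int, k: int = 10) -> list[int]:
        # Candidate set: exact sub-string matches plus every radius-1 variant.
        cands: set[int] = set()
        for t, key in enumerate(sub_keys(query)):
            cands.update(self.tables[t].get(key, ()))
            for bit in range(SUB_BITS):              # flip each of the 16 bits
                cands.update(self.tables[t].get(key ^ (1 << bit), ()))
        # Re-rank by full Hamming distance (int.bit_count is POPCNT, Python 3.10+).
        return sorted(cands, key=lambda i: (self.fingerprints[i] ^ query).bit_count())[:k]
```

By the pigeonhole principle, any fingerprint within Hamming distance 127 of the query must agree with it to within distance 1 on at least one of the 64 sub-strings, so the candidate set provably contains all such neighbours; the search is exact, not approximate.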
NodeMind uses BGE-M3, the state-of-the-art multilingual embedding model with 1024 dimensions. Dense, sparse, and multi-vector representations are supported. The model is loaded once per worker — no repeated downloads.
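One plausible way to run this stage is via the public FlagEmbedding package (the model id is BAAI/bge-m3); NodeMind's actual loader is not published and may differ:

```python
# Embedding-stage sketch using FlagEmbedding (pip install FlagEmbedding).
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)   # load once per worker
out = model.encode(
    ["NodeMind compresses RAG indexes.", "Hamming search is integer-only."],
    return_dense=True,   # dense 1024-dim vectors; sparse/multi-vector also available
)
dense_vecs = out["dense_vecs"]   # array of shape (2, 1024)
```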
After indexing, users download two zip files: the NodeMind binary index and a standard RAG float32 index. Both run completely offline using the included nodemind_local.py runner. No cloud subscription needed to query.
```
User uploads PDF
        │
        ▼
[ FastAPI — nodemind.space ]  ← nginx + SSL (Google Cloud VPS, 1 TB)
        │
        ▼  submit job
[ Community Hardware: RTX 3080 + 128 GB RAM ]
    1. pdfplumber → chunks
    2. BGE-M3 → float32 embeddings (1024-dim)
    3. Patent-pending binary codec → 1024-bit fingerprints (32× smaller; up to 128× at 256-bit)
    4. MIH index: 64 sub-tables × 16-bit keys
    5. RAG index: float32 cosine (comparison baseline)
    6. Return nm_zip + rag_zip
        │
        ▼
[ VPS stores zips ]  ← auto-deleted after 24 hours
        │
        ▼
User downloads both — runs offline
```
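A minimal sketch of step 1 (pdfplumber → chunks), assuming naive fixed-size character chunking; NodeMind's production chunker is not published and may differ:

```python
import pdfplumber

def pdf_to_chunks(path: str, chunk_chars: int = 1024) -> list[str]:
    """Extract all page text from a PDF and split it into fixed-size chunks."""
    with pdfplumber.open(path) as pdf:
        text = "\n".join(page.extract_text() or "" for page in pdf.pages)
    return [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)]
```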
NodeMind's core algorithm is protected by an Australian provisional patent held by Sai Kiran Bathula, independent researcher, Coleambally NSW.
No installation. No API key. Upload any PDF, TXT, or Markdown file at the live demo and get a portable binary index back in under 2 minutes.
NodeMind is built by a solo independent researcher. Reach out for licensing, enterprise integration, or research collaboration.
saikiranbathula1@gmail.com