ArkNill

Not training bigger models —
building better plumbing around them: collection, extraction, search, verification, guardrails.
AI pipelines that run without cloud API keys.


About

Backend engineer with 6 years of experience. Worked across 6 heterogeneous databases (PostgreSQL, Oracle, MariaDB, MSSQL, DB2, Netezza) with native SQL tuning. Built DW/DM, BI/OLAP, and analytics systems for public and enterprise clients. Extensive deployment experience in air-gapped environments.

Currently building local-inference AI pipelines that cover data collection, refinement, RAG, and fact-checking. Everything runs on-premise without cloud API calls. Tools extracted from this work are published as open source under QuartzUnit.

Project Ecosystem

Each project is independent but connected by data flow. Raw data flows through refinement, mart, and agent layers.

Forge · Data Refinement

Turns multi-source raw data into an LLM-ready data warehouse. News, blogs, legacy DBs: any source can be refined. Includes fact-checking at 83.6% accuracy in production.

→

QubicAI · Data Mart

Auto-generates data marts on top of Forge DW. Natural language to SQL — BI dashboards in one sentence.

→

Mirror Agent · Autonomous Agent

Custom ReAct + RAG (Qdrant + Neo4j) + persistent memory + 2-tier LLM fallback. On-premise, no frameworks.
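One plausible reading of "2-tier LLM fallback" is to try a primary model first and fall back to a secondary one when the call fails. The sketch below illustrates that pattern only; `generate_with_fallback` and the stub tiers are hypothetical names, not Mirror Agent's actual code, and each tier is assumed to be a plain callable that takes a prompt and returns text.

```python
from typing import Callable, Sequence


def generate_with_fallback(prompt: str, tiers: Sequence[Callable[[str], str]]) -> str:
    """Try each tier in order and return the first successful completion."""
    errors = []
    for call in tiers:
        try:
            return call(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(exc)
    raise RuntimeError(f"all {len(tiers)} tiers failed: {errors!r}")


# Hypothetical stand-ins for a primary and a secondary local model.
def primary_model(prompt: str) -> str:
    raise TimeoutError("primary model did not respond")


def secondary_model(prompt: str) -> str:
    return f"[secondary] {prompt}"


answer = generate_with_fallback("summarize the report", [primary_model, secondary_model])
```

Passing the tiers as an ordered list keeps the retry logic independent of how each model is served.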

↑ Tool Layer

QuartzUnit OSS · Open Source

10 Python packages extracted from the above projects. Collection, extraction, search, monitoring, guardrails — modular tool ecosystem.

All projects run on local inference (vLLM on-premise). No cloud API dependency.

QuartzUnit Packages

Each tool solves one problem. CLI + async Python API + MCP server — three interfaces as standard.

markgrab

URL to LLM-ready markdown. HTML, YouTube, PDF, DOCX.

114 tests · PyPI 0.1.2

docpick

Schema-based document OCR to structured JSON.

217 tests · PyPI 0.1.2

feedkit

RSS/Atom feed collection. Built-in catalog of 444 curated feeds.

34 tests · PyPI 0.1.1

browsegrab

Local LLM-optimized browser agent. Accessibility tree + token savings.

200 tests · PyPI 0.1.1

snapgrab

URL to screenshot + metadata. Claude Vision optimized.

29 tests · PyPI 0.1.0

diffgrab

Web page change detection + structured diff generation.

89 tests · PyPI 0.1.0
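To make "structured diff" concrete, here is a minimal sketch that compares two text snapshots of a page and emits one record per changed region. It uses only the standard library's `difflib`; the function name `structured_diff` and the record fields are assumptions for illustration and do not reflect diffgrab's actual API.

```python
import difflib


def structured_diff(old: str, new: str) -> list[dict]:
    """Return one record per changed region between two text snapshots."""
    old_lines = old.splitlines()
    new_lines = new.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged regions are not reported
        changes.append({
            "op": tag,                    # "replace", "delete", or "insert"
            "old_start": i1 + 1,          # 1-based line number in the old snapshot
            "removed": old_lines[i1:i2],
            "added": new_lines[j1:j2],
        })
    return changes


changes = structured_diff("a\nb\nc", "a\nB\nc\nd")
```

Records like these serialize directly to JSON, which is easier for a downstream tool or model to consume than a unified diff.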

embgrep

Local semantic search. Embedding-based grep — search by meaning.

74 tests · PyPI 0.1.0

llm-degen-guard

LLM output degeneration detection. 4-signal composite scoring.

55 tests · PyPI 0.1.0

agent-loop-guard

Agent infinite loop detection. Sliding window similarity.

78 tests · PyPI 0.1.0
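The idea of sliding-window similarity can be sketched as follows: keep the last few agent actions and flag a loop when a new action closely resembles several of them. This is an illustration under assumptions, not agent-loop-guard's API; the class name `LoopDetector`, the default window size, the threshold, and the use of `difflib` string similarity are all hypothetical choices.

```python
from collections import deque
from difflib import SequenceMatcher


class LoopDetector:
    """Flags an action that is near-identical to several recent actions."""

    def __init__(self, window: int = 5, threshold: float = 0.9, max_repeats: int = 2):
        self.recent = deque(maxlen=window)  # sliding window of past actions
        self.threshold = threshold          # minimum ratio to count as "similar"
        self.max_repeats = max_repeats      # similar past actions needed to flag a loop

    def observe(self, action: str) -> bool:
        """Record an action; return True if it looks like a loop."""
        similar = sum(
            1
            for past in self.recent
            if SequenceMatcher(None, past, action).ratio() >= self.threshold
        )
        self.recent.append(action)
        return similar >= self.max_repeats


detector = LoopDetector()
flags = [detector.observe("search(q='x')") for _ in range(3)]
```

With these defaults the third identical action is the first one flagged, since it matches two entries already in the window.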

agent-action-policy

Declarative action policy. 4 built-in templates, zero dependencies.

69 tests · PyPI 0.1.0
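A declarative action policy separates the rules (data) from the code that enforces them. The sketch below shows one common shape for this, assuming ordered glob rules with a first-match-wins evaluation and a default of deny. `Rule`, `decide`, and the action names are hypothetical and are not taken from agent-action-policy itself.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    pattern: str  # glob matched against the action name
    effect: str   # "allow" or "deny"


def decide(rules: list[Rule], action: str, default: str = "deny") -> str:
    """First matching rule wins; otherwise fall back to the default effect."""
    for rule in rules:
        if fnmatchcase(action, rule.pattern):
            return rule.effect
    return default


# The policy is plain data, so it could equally be loaded from a config file.
POLICY = [
    Rule("fs.read.*", "allow"),
    Rule("fs.*", "deny"),
    Rule("http.get", "allow"),
]

verdict = decide(POLICY, "fs.read.file")
```

Rule order matters here: the specific `fs.read.*` allowance is listed before the broader `fs.*` denial so that it takes effect.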

10 packages · 959 tests · MIT License · Korean + English docs

By the Numbers

6 Heterogeneous DBs

10 PyPI Open Source Packages

959 Tests (Full Ecosystem)

444 Curated RSS Feeds

800K+ Collected Articles (115 domains)

83.6% Fact-checking Accuracy (Prod)

Tech Stack

Languages

Python · Java · Kotlin · TypeScript · SQL

Backend & AI

FastAPI · Spring Boot · vLLM · Qdrant · Neo4j

Databases

PostgreSQL · Oracle · MariaDB · MSSQL · DB2 · Redis

Frontend & Mobile

Next.js · React Native · Jetpack Compose

Infra

Docker · GitHub Actions · Cloudflare · Playwright