First release of Bnklab
Introduction
What is this thing? Bnklab is a local-first tool for data cleaning, labeling, and validation. It works offline, supports CSV, JSON, Excel, and Parquet, and combines automatic processing with manual changes and review.
The goal is to be as frictionless and ergonomic as possible. This means user-defined data sources, customizable actions, and keybindings to trigger them, with no complex setup or coding required.
While the primary interface to the data is a table, it is not a spreadsheet. If a spreadsheet is Lego, this is Duplo: simpler and designed for a more specific purpose.
Background
Bnklab started as an internal tool for extracting and categorizing text using multiple regex rules, later supplemented by local language models (AI). The results were then manually verified and marked.
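To make the rule-based step concrete, here is a minimal sketch of categorizing text with an ordered list of regex rules. It is not Bnklab's actual code: the `Rule` and `Match` shapes, the `categorize` function, and the two example patterns are all assumptions made for illustration.

```typescript
// Hypothetical rule shape: a label plus the pattern that assigns it.
interface Rule {
  label: string;
  pattern: RegExp;
}

// Hypothetical result: the assigned label and the text the pattern matched.
interface Match {
  label: string;
  extracted: string;
}

// Illustrative rules only; not taken from Bnklab.
const rules: Rule[] = [
  { label: "invoice", pattern: /\binvoice\s*#?\d+/i },
  { label: "refund", pattern: /\brefund(ed)?\b/i },
];

// Return the first rule that matches, or null so the row can be
// left for a language model or manual review.
function categorize(text: string, ruleSet: Rule[]): Match | null {
  for (const rule of ruleSet) {
    const m = rule.pattern.exec(text);
    if (m) {
      return { label: rule.label, extracted: m[0] };
    }
  }
  return null;
}
```

Rule order matters in this sketch, since the first match wins. Returning `null` rather than a default label keeps unmatched rows visibly unprocessed.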
Sometimes the text alone wasn't enough, but the data included images that could help determine the correct result, either through AI or manual review. This, however, made for a very repetitive workflow in which two things became essential: ergonomic keybindings, for speed and to reduce fatigue, and clear visual cues (like distinct row colors per status) to track what had been processed or verified.
Progressive Web App (PWA)
A key architectural decision is that all data stays local on the user's computer. The data is handled through the browser's File System API, and settings are kept in localStorage. One limitation of this approach is that the maximum file size is around 4 GB, depending on the browser and platform.
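As a rough illustration of the settings side, the sketch below reads and writes a settings object through the `getItem`/`setItem` methods that localStorage provides. The `Settings` fields, the `example.settings` key, and the helper names are hypothetical and not Bnklab's actual schema. The in-memory store is included only so the example also runs outside a browser.

```typescript
// Subset of the Web Storage API that localStorage implements.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical settings shape, for illustration only.
interface Settings {
  theme: "light" | "dark";
  rowHeight: number;
}

const DEFAULTS: Settings = { theme: "light", rowHeight: 28 };
const SETTINGS_KEY = "example.settings"; // assumed key name

function loadSettings(store: KeyValueStore): Settings {
  const raw = store.getItem(SETTINGS_KEY);
  if (raw === null) return { ...DEFAULTS };
  try {
    // Merge over defaults so fields missing from older data get a value.
    return { ...DEFAULTS, ...(JSON.parse(raw) as Partial<Settings>) };
  } catch {
    // Stored value is not valid JSON: fall back to defaults.
    return { ...DEFAULTS };
  }
}

function saveSettings(store: KeyValueStore, settings: Settings): void {
  store.setItem(SETTINGS_KEY, JSON.stringify(settings));
}

// In-memory stand-in so the sketch runs outside a browser.
class MemoryStore implements KeyValueStore {
  private data = new Map<string, string>();
  getItem(key: string): string | null {
    return this.data.get(key) ?? null;
  }
  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }
}
```

In a browser, `loadSettings(window.localStorage)` would work unchanged, since `Storage` exposes the same two methods.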
AI & LLMs
Bnklab supports local language models as well as cloud AI providers. The app is still in its early stages, so if you are using a cloud provider, set strict budget limits and monitor usage to avoid unexpected costs.
Concepts
- Import: Bring in data from various file types (CSV, JSON, Excel, Parquet)
- Sources: Define where data comes from, like APIs
- Actions: Customizable operations that can be triggered by keybindings or run automatically
- Quick values and highlighting: Visual cues and shortcuts for common values and statuses
- Table: The main interface for viewing and editing data
- Workspaces: Organize your work into separate workspaces with their own data and settings
- AI backends: Support for local and cloud AI models for data processing and validation
- Runner: A system for executing actions, either manually or automatically based on triggers
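To show how actions, keybindings, and the runner might fit together, here is a small sketch. Every name in it (`Row`, `Action`, `Runner`, `press`, `runAuto`) is an assumption for illustration; the post does not describe Bnklab's internal API.

```typescript
// Hypothetical row shape: column name to cell value.
type Row = Record<string, string>;

// Hypothetical action: may be bound to a key, run automatically, or both.
interface Action {
  name: string;
  key?: string;         // optional keybinding, e.g. "a"
  auto?: boolean;       // run automatically on every row
  apply(row: Row): Row; // returns an updated copy of the row
}

class Runner {
  private byKey = new Map<string, Action>();
  private automatic: Action[] = [];

  register(action: Action): void {
    if (action.key !== undefined) this.byKey.set(action.key, action);
    if (action.auto) this.automatic.push(action);
  }

  // Manual trigger: run the action bound to the pressed key, if any.
  press(key: string, row: Row): Row {
    const action = this.byKey.get(key);
    return action ? action.apply(row) : row;
  }

  // Automatic trigger: apply every auto action in registration order.
  runAuto(row: Row): Row {
    return this.automatic.reduce((r, a) => a.apply(r), row);
  }
}

const runner = new Runner();
runner.register({
  name: "trim",
  auto: true,
  apply: (r) => ({ ...r, text: (r.text ?? "").trim() }),
});
runner.register({
  name: "approve",
  key: "a",
  apply: (r) => ({ ...r, status: "verified" }),
});
```

In a browser, `press` could be called from a `keydown` listener with the event's `key` value. A status field like the one set by the `approve` action is the kind of value that row highlighting could then key off.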