Show HN: QVAC SDK, a universal JavaScript SDK for building local AI applications

Hi folks, today we're launching QVAC SDK [0], a universal JavaScript/TypeScript SDK for building local AI applications across desktop and mobile.

The project is fully open source under the Apache 2.0 license. Our goal is to make it easier for developers to build useful local-first AI apps without having to stitch together a lot of different engines, runtimes, and platform-specific integrations. Under the hood, the SDK is built on top of QVAC Fabric [1], our cross-platform inference and fine-tuning engine.

QVAC SDK runs on Bare [2], a lightweight cross-platform JavaScript runtime that is part of the Pear ecosystem [3]. It can be used as a worker pretty much anywhere, with built-in tooling for Node, Bun, and React Native (Hermes).

A few things it supports today:

  - Local inference across desktop, mobile and servers
  - Support for LLMs, OCR, translation, transcription, text-to-speech, and vision models
  - Peer-to-peer model distribution over the Holepunch stack [4], BitTorrent-style: any peer can become a seeder
  - Plugin-based architecture, so new engines and model types can be added easily
  - Fully peer-to-peer delegated inference
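
What makes "anyone can become a seeder" safe in a BitTorrent-style scheme is content addressing: a downloader can accept chunks from any untrusted peer and verify each one locally against a hash manifest. A minimal TypeScript sketch of that idea (this is a conceptual illustration, not QVAC's actual wire format or API; chunk size and hash choice are assumptions):

```typescript
import { createHash } from "crypto";

function sha256(data: Buffer): string {
  return createHash("sha256").update(data).digest("hex");
}

// Split a model blob into fixed-size chunks and record each chunk's hash.
// The manifest is the only thing that must come from a trusted source.
function buildManifest(blob: Buffer, chunkSize: number): string[] {
  const manifest: string[] = [];
  for (let off = 0; off < blob.length; off += chunkSize) {
    manifest.push(sha256(blob.subarray(off, off + chunkSize)));
  }
  return manifest;
}

// A chunk received from ANY peer is kept only if its hash matches the
// manifest entry for that index; tampered or corrupt chunks are rejected.
function verifyChunk(manifest: string[], index: number, chunk: Buffer): boolean {
  return manifest[index] === sha256(chunk);
}
```

With per-chunk verification, seeding requires no trust in the seeder, only in whoever published the manifest.
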

We also put a lot of effort into the documentation [5]. The docs are structured to be readable by both humans and AI coding tools, so in practice you can get quite far, quite quickly, with your favorite coding assistant.

A few things we know still need work:

  - Bundle sizes are larger than we'd like; the current packaging of Bare add-ons is not yet as efficient as it should be
  - The plugin workflow could be simpler
  - Tree-shaking already works, but it currently requires a separate CLI step; we'd like to make it automatic and better integrated into the build process

This launch is only the beginning: we want to help people build local AI at a much larger scale. Any feedback is truly appreciated! The full vision is on the official website [6].

References:

[0] SDK: https://qvac.tether.io/dev/sdk

[1] QVAC Fabric: https://github.com/tetherto/qvac-fabric-llm.cpp

[2] Bare: https://bare.pears.com

[3] Pear Runtime: https://pears.com

[4] Holepunch: https://holepunch.to

[5] Docs: https://docs.qvac.tether.io

[6] Website: https://qvac.tether.io

30 points | by qvac 6 days ago

11 comments

  • shaz0x 14 hours ago
    Went through the SDK docs before asking. On RN/Expo specifically, does Fabric run inside a Bare worklet with IPC back to Hermes, or drop into a native module the way llama.rn does via JNI and llama.cpp? Perf and memory footprint would look very different between the two, curious which path you landed on.
  • WillAdams 5 days ago
    Do you really mean/want to say:

    >...and without permission on any device.

    I would be much more interested in a tool which only allows AI to run within the boundaries which I choose and only when I grant my permission.

    • elchiapp 5 days ago
      That line means that you don't need to create an account and get an API key from a provider (i.e. "asking for permission") to run inference. The main advantage is precisely that local AI runs on your terms, including how data is handled, and provably so, unlike cloud APIs where there's still an element of trust with the operator.

      (Disclaimer: I work on QVAC)

      • WillAdams 5 days ago
        OIC.

        Should it be re-worded so as to make that unambiguous?

      • sull 5 days ago
        thoughts on mesh-llm?
    • mafintosh 5 days ago
      The modular philosophy of the full stack is to give you the building blocks for exactly this also :)
      • WillAdams 5 days ago
        Looking through the rest of the material, I can see that, but at first glance this point is easy to misread.
  • angarrido 5 days ago
    Local inference is getting solved pretty quickly.

    What still seems unsolved is how to safely use it on real private systems (large codebases, internal tools, etc) where you can’t risk leaking context even accidentally.

    In our experience that constraint changes the problem much more than the choice of runtime or SDK.

    • elchiapp 3 days ago
      Curious to hear which constraints aren't tackled by the current offering of local runtimes/SDKs for inference.
  • moffers 5 days ago
    This is all very ambitious. I am not exactly sure where someone is supposed to start. With the connections to Pear and Tether I can see where the lines meet, but is the idea that someone takes this and builds…Skynet? AI Cryptocurrency schemes? Just a local LLM chat?
    • elchiapp 5 days ago
      You can build anything! Check out our tutorials here: https://docs.qvac.tether.io/sdk/tutorials/

      Although an LLM chat is the starting point for many, there are plenty of other use cases. People have built home-automation systems to control their house using natural language, vision-based assistants for surveillance (e.g. sending a notification describing what's happening instead of a generic "Movement detected"), etc., and everything stays on your device / in your network.

  • elchiapp 5 days ago
    Hey folks, I'm part of the QVAC team. Happy to answer any questions!
    • knocte 5 days ago
      Are there incentives for nodes to join the swarm (become a seeder)? If yes, how exactly do they get paid in a decentralized way? Any URL where to get info about this?
      • mafintosh 5 days ago
        It's through the Holepunch stack (I'm the original creator). Incentives for sharing are social, like in BitTorrent: if I use a model with my friends and family, I can help rehost it for them.
  • yuranich 5 days ago
    Hackathon when?