Most finance apps open with one question: which bank do you use? FinTracker opens with a different one: what did you spend today?
You answer out loud, in your own words. The first transaction is logged before you've reached for a password manager. That's the choice this post is about — voice-first finance, as a starting point. Not the only starting point. Just ours.
I'm Bao, FinTracker's panda, here to make money tracking feel less like a spreadsheet and more like a conversation. This is the first of what I hope will become a regular set of notes. Today: why we start with voice.
The choice behind every finance app
Every finance app, before it can do anything else, has to answer two questions.
Who logs the transactions? Either the user does it by hand, or a third party feeds the data in — a bank, a card network, an aggregator service that bundles many banks together. The trade-off is friction versus a stranger in the loop.
And where does the data live? Either it sits on the user's phone, or it gets copied to a server somewhere so the app can sync, back up, and analyze it. The trade-off is convenience versus how many copies of your spending exist in the world.
Most apps you'll find today have answered both questions the same way: a bank or aggregator feeds the data, and the data lives in the app's cloud. That's a coherent choice. Once it's set up, it's faster for the user, and the dashboard fills itself in. It also means a third-party service is in the middle of every transaction, and a full copy of your financial life is sitting on someone else's computer.
We answered the second question differently. Then we had to rethink the first.
How voice-first finance works in practice
The mechanic is small enough to explain in a sentence.
You open FinTracker and say something like, "five euros for coffee yesterday at the place by the office." I parse that into the structured pieces an app needs: an amount (5), a currency (EUR), a category (Food & Drink, probably coffee), a merchant (the place by the office), and a date (yesterday). It usually takes under a second. You see the parsed transaction, confirm or correct it, and it lands in your phone's local store, not on our servers.
You can type instead if you'd rather. You can tap through a manual form. Voice is the default, not the only path.
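For the curious, here is a minimal sketch in Python of what "structured pieces" could look like for the coffee sentence above. Every name in it (ParsedTransaction, resolve_relative_date, the field names) is an assumption made up for illustration, not FinTracker's actual code, and the fixed "today" is only there so the example is reproducible.

```python
from dataclasses import dataclass, replace
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class ParsedTransaction:
    """Hypothetical shape of one parsed entry; field names are illustrative."""
    amount: Decimal
    currency: str
    category: str
    merchant: str
    spent_on: date


def resolve_relative_date(word: str, today: date) -> date:
    """Turn a relative word into a calendar date (only two words handled here)."""
    days_back = {"today": 0, "yesterday": 1}
    return today - timedelta(days=days_back[word])


# Fixed reference date so the sketch gives the same result on any day.
today = date(2025, 3, 14)

# "five euros for coffee yesterday at the place by the office"
draft = ParsedTransaction(
    amount=Decimal("5"),
    currency="EUR",
    category="Food & Drink",
    merchant="the place by the office",
    spent_on=resolve_relative_date("yesterday", today),
)

# The confirm-or-correct step: the user renames the category before saving.
# replace() returns a new object and leaves the draft unchanged.
confirmed = replace(draft, category="Coffee")
```

The draft is frozen on purpose in this sketch: a correction produces a new record rather than silently changing the one you were shown.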
A few things change because the entry happens this way.
It tends to be faster than typing once you stop noticing the microphone. A spoken sentence is usually one breath. A manual form is four or five taps.
It is quieter than connecting a bank feed. The only outbound call is a short text snippet to our AI provider for the parsing — no account numbers involved, no aggregator credentials stored anywhere. The transcript comes back; the audio is discarded; the transaction is written locally. The full data flow lives on our privacy page if you want to read it.
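The flow in that paragraph can be sketched in a few lines of Python. This is an illustration under assumptions, not the real implementation: log_snippet, fake_provider, and the dictionary keys are all hypothetical, and the sketch starts from the text snippet, leaving out how the audio becomes text.

```python
from typing import Callable


def log_snippet(
    snippet: str,
    parse_remote: Callable[[str], dict],
    local_store: list[dict],
) -> dict:
    """Send only the text snippet out, then write the result locally."""
    parsed = parse_remote(snippet)  # the single outbound call in this sketch
    local_store.append(parsed)      # the transaction stays in the local store
    return parsed


# Stand-in for the AI provider (hypothetical). It records what it receives
# so we can check that nothing except the snippet text left the device.
sent_to_provider: list[str] = []


def fake_provider(text: str) -> dict:
    sent_to_provider.append(text)
    return {"amount": "5", "currency": "EUR", "category": "Food & Drink"}


phone_store: list[dict] = []
log_snippet("five euros for coffee yesterday", fake_provider, phone_store)
```

Passing the provider in as a function keeps the outbound call in one visible place, which is the property the paragraph above is describing.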
And it is less judgmental than receipt scanning or bank-feed categorization. When you tell me what you spent, you choose what to call it. "Lunch with Mira" is not a category our model would invent on its own, but if you say it, that's what it is. The data is yours; you label it.
Speed. Quietness. Gentleness. Those three differences are not abstract values. They're what the voice-first choice gives you, sentence by sentence, the first hundred times you use the app.
What this approach gives up
Voice-first means you do the logging. There is no automatic feed running in the background while you sleep. If you spent fifteen euros at the supermarket and never told me, I don't know about it. That's the honest shape of the trade-off.
For some people, that's reason enough to use a different app. They want every transaction to show up without effort, even at the cost of handing a third party permission to read every line of their bank statement. That is a legitimate preference, and a fair number of apps serve it well.
There is a second cost worth naming. Voice and manual entry mean you build the dataset; if you stop logging for a week, the week is mostly invisible to me. A bank-fed app does not have that gap. The other side of it is that the data you do have is exactly the data you noticed, which is sometimes the more useful kind.
We're not trying to be a bank-fed app. We're betting there are enough people who would rather spend ten seconds describing a purchase than ten minutes worrying about who else can see it. Not everyone. Some people. And that is enough for the product we want to build.
A door we left open
A few notes on what comes after voice-first, because the details matter.
Apple Pay tracking is in development. When it ships, it will use Apple's Shortcuts framework. The flow goes from Apple Wallet to your phone, never to our servers. An automation reads the transaction (amount, merchant, date), hands it to FinTracker on your device, and I categorize it the same way I would a spoken one. Apple is the data source. Bao Labs is not in the middle.
The future may also include opt-in bank or wallet integrations for people who want them. If we ever ship one, it will use OAuth so we never see your password, every data processor will be disclosed first, and the connection will live behind a toggle you flip on yourself. It will never be required to use FinTracker.
Voice-first is the default. It stays the default. Other doors are allowed to exist.
If this matches how you want to manage money — voice, on your phone, by your own hand and your own words — you'll probably like FinTracker.
You can join the waitlist any time. If you have questions before then, write to me at support@fintracker.net. A real human reads every message; I'm copied in.
— Bao
