Projects
Automated pipeline for landscape analysis in biotech and deep tech: scrapes sources, extracts structured data via LLM, and produces formatted HTML reports.
Deep tech fields are paper-heavy by design. Most competitive mapping tools either miss the academic layer entirely or return noise because they have no domain context. This tool turns that constraint into a signal: author affiliations in papers name the labs and institutes doing the work, and those become discovery candidates. The pipeline verifies each candidate against topic keywords, runs two sequential LLM passes to filter and structure the results, and produces a self-contained HTML report. The input parameters are fully configurable, so the same pipeline can map any technical landscape, not just the niche it was prototyped on. A field that would take hours of unstructured searching to survey comes back as a structured overview in five minutes.
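The discovery step can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the record layout, the `discover_candidates` helper, the `min_hits` parameter, and the choice to match keywords against paper titles are all assumptions, and the LLM passes are left out.

```python
from collections import Counter

# Hypothetical paper records; a real run would build these from scraped sources.
papers = [
    {"title": "Phage display for enzyme engineering",
     "affiliations": ["Institute A", "Lab B"]},
    {"title": "Directed evolution of phage enzymes",
     "affiliations": ["Lab B"]},
    {"title": "Municipal zoning policy review",
     "affiliations": ["Institute C"]},
]


def discover_candidates(papers, keywords, min_hits=1):
    """Count on-topic papers per affiliation; keep those with >= min_hits.

    Assumption: a paper is on-topic if any keyword appears in its title.
    """
    keywords = [k.lower() for k in keywords]
    hits = Counter()
    for paper in papers:
        title = paper["title"].lower()
        if any(k in title for k in keywords):
            # set() so an affiliation listed twice on one paper counts once
            for name in set(paper["affiliations"]):
                hits[name] += 1
    return {name: n for name, n in hits.items() if n >= min_hits}


candidates = discover_candidates(papers, ["phage", "enzyme"])
print(candidates)
```

Here "Institute C" never becomes a candidate because its only paper matches no topic keyword, while "Lab B" ranks higher than "Institute A" because it appears on two on-topic papers.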
Python · Streamlit · Anthropic API · BeautifulSoup · Ollama
GitHub
AI-powered personal recommender across ~20k fragrances: TF-IDF on olfactory note pyramids, rating-based profile building, and LLM-generated explanations.
The question that made this interesting was whether a weighted TF-IDF vector over olfactory note pyramids could actually capture something real about taste. Fragrances are described as structured lists of notes — top, middle, base — which means they can be treated as a text corpus. A personal profile is built from your own ratings: each fragrance's note string is weighted by how much you liked it, accumulated over time. Cosine similarity finds the nearest neighbors. An avoid penalty subtracts similarity to things you've rated poorly. The results look promising — and the domain isn't incidental. It's a case study in what happens when you apply scientific attention to something personal.
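A dependency-free sketch of that scoring idea is below. It stands in for the scikit-learn version rather than reproducing it: the tiny catalog, the rating thresholds (`like_at`, `avoid_at`), the `penalty` factor, and all function names are assumptions made for illustration, and the notes are not weighted by pyramid position here.

```python
import math
from collections import Counter

# Hypothetical catalog: each fragrance is its note pyramid flattened to tokens.
catalog = {
    "A": ["bergamot", "jasmine", "vanilla"],
    "B": ["bergamot", "rose", "vanilla"],
    "C": ["oud", "leather", "smoke"],
    "D": ["bergamot", "jasmine", "vanilla", "musk"],
    "E": ["oud", "jasmine", "smoke"],
}


def tfidf_vectors(docs):
    """Return {name: {note: tf * idf}} with smoothed idf = ln((1+n)/(1+df)) + 1."""
    n = len(docs)
    df = Counter(note for notes in docs.values() for note in set(notes))
    idf = {note: math.log((1 + n) / (1 + d)) + 1 for note, d in df.items()}
    return {
        name: {note: count * idf[note] for note, count in Counter(notes).items()}
        for name, notes in docs.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def weighted_sum(vectors, weights):
    """Accumulate fragrance vectors, each scaled by its weight."""
    total = Counter()
    for name, weight in weights.items():
        for note, value in vectors[name].items():
            total[note] += weight * value
    return dict(total)


def recommend(vectors, ratings, like_at=4, avoid_at=2, penalty=0.5):
    """Rank unrated fragrances: similarity to liked minus penalized similarity to avoided."""
    liked = {name: r for name, r in ratings.items() if r >= like_at}
    avoided = {name: 1.0 for name, r in ratings.items() if r <= avoid_at}
    profile = weighted_sum(vectors, liked)
    avoid = weighted_sum(vectors, avoided)
    scores = {
        name: cosine(profile, vec) - penalty * cosine(avoid, vec)
        for name, vec in vectors.items()
        if name not in ratings
    }
    return sorted(scores, key=scores.get, reverse=True)


vectors = tfidf_vectors(catalog)
ranking = recommend(vectors, {"A": 5, "C": 1})
print(ranking)
```

With "A" rated highly and "C" rated poorly, "D" (which shares all three of A's notes) ranks first, and "E" drops to the bottom because it shares oud and smoke with the avoided fragrance.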
Python · Streamlit · scikit-learn · Anthropic API
GitHub
Training log built around the Relative Training Volume metric, with AI coaching analysis to flag recovery debt and progression patterns.
Raw volume — kilograms lifted, sets completed — is almost meaningless without context. A 60 kg bench press means something very different to a 60 kg athlete than to a 90 kg one. RTV normalizes load by bodyweight, producing a dimensionless number that is comparable across time and across people. The metric wasn't found — it was reasoned into existence from first principles, then confirmed. The app stores bodyweight with dates so historical sessions are never retroactively recalculated with current weight. A radar chart maps muscle group balance against a reference athlete and against your own previous periods. The LLM coaching layer reads structured session data — not free text — and closes every analysis with a Dodgeball quote.
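The dated-bodyweight idea can be illustrated with a short sketch. The exact RTV formula is not spelled out above, so the one used here (total load times reps, divided by bodyweight at session time) is an assumption, as are the log layout and the function names.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical bodyweight log, sorted by date: (measured_on, kilograms).
bodyweight_log = [
    (date(2024, 1, 1), 80.0),
    (date(2024, 3, 1), 75.0),
]


def bodyweight_on(log, day):
    """Most recent bodyweight recorded on or before the given day."""
    dates = [measured_on for measured_on, _ in log]
    i = bisect_right(dates, day)
    if i == 0:
        raise ValueError(f"no bodyweight recorded on or before {day}")
    return log[i - 1][1]


def relative_training_volume(sets, day, log):
    """Assumed formula: sum(load_kg * reps) / bodyweight_kg at session time."""
    volume_kg = sum(load_kg * reps for load_kg, reps in sets)
    return volume_kg / bodyweight_on(log, day)


session = [(60.0, 10), (60.0, 10)]  # (load_kg, reps) per set

rtv_feb = relative_training_volume(session, date(2024, 2, 1), bodyweight_log)
rtv_apr = relative_training_volume(session, date(2024, 4, 1), bodyweight_log)
print(rtv_feb, rtv_apr)  # 15.0 16.0
```

The same session scores 15.0 in February (at 80 kg) and 16.0 in April (at 75 kg). Because each lookup uses the weight that was current on the session date, logging a new bodyweight never changes the value of an older session.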
Python · Streamlit · Plotly · Anthropic API
GitHub
A graph-based second brain for sustained intellectual exploration: maps ideas and connections across reading sessions, built around a persistent knowledge graph that grows more useful over time.
I was reading Judt, Fisher, Han — dense, interlocking arguments — and linear notes weren't working. Ideas connect in all directions, not in sequence. So I built a tool that treats knowledge as a graph. Each exploration session with Claude generates a structured JSON export that feeds into the graph. Nodes are concepts, edges are connections, and the global view lets you filter branches and focus on individual nodes with their neighbors. The graph becomes more useful the more you use it. Built for a specific deep dive, designed to generalize to any domain of sustained intellectual inquiry.
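As a rough sketch of how session exports could accumulate into one graph, here is a plain-Python stand-in for the NetworkX version. The JSON schema (`nodes` with an `id`, `edges` with `source` and `target`), the `KnowledgeGraph` class, and the sample concepts are all hypothetical.

```python
import json
from collections import defaultdict


class KnowledgeGraph:
    """Undirected concept graph that grows as session exports are merged in."""

    def __init__(self):
        self.nodes = {}                    # concept id -> attributes
        self.neighbors = defaultdict(set)  # concept id -> adjacent concept ids

    def merge_export(self, raw_json):
        """Merge one session export; repeated concepts update the existing node."""
        data = json.loads(raw_json)
        for node in data["nodes"]:
            self.nodes.setdefault(node["id"], {}).update(node)
        for edge in data["edges"]:
            a, b = edge["source"], edge["target"]
            # Make sure both endpoints exist even if the export omitted them.
            self.nodes.setdefault(a, {"id": a})
            self.nodes.setdefault(b, {"id": b})
            self.neighbors[a].add(b)
            self.neighbors[b].add(a)

    def focus(self, concept):
        """Return the concept together with its direct neighbors."""
        return {concept} | self.neighbors.get(concept, set())


# Two hypothetical session exports that share the concept "memory".
session_1 = json.dumps({
    "nodes": [{"id": "memory"}, {"id": "identity"}],
    "edges": [{"source": "memory", "target": "identity"}],
})
session_2 = json.dumps({
    "nodes": [{"id": "memory", "branch": "history"}, {"id": "narrative"}],
    "edges": [{"source": "memory", "target": "narrative"}],
})

graph = KnowledgeGraph()
graph.merge_export(session_1)
graph.merge_export(session_2)
print(sorted(graph.focus("memory")))  # ['identity', 'memory', 'narrative']
```

Because nodes are keyed by concept id, the second session extends the existing "memory" node instead of duplicating it, which is what lets the graph become denser across sessions.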
Python · Streamlit · Claude API · NetworkX · Plotly
GitHub