renzo rico · data scientist · barcelona

I take messy, unstructured problems and turn them into working systems — LLM pipelines, NLP corpora, interactive data tools. End-to-end: raw source to deployed product, no handoffs.

4 live deployed tools

sole builder on every project: data to deployed UI

scraping → ETL → ML → agentic AI → deployed product

$ cat brief.txt

open to roles

role target: Data Scientist · product-facing

location: Barcelona · remote or relocation in Europe

available: Now

core stack: Python · SQL · LLMs · NLP · TypeScript · end-to-end, no handoffs

three proofs

→ Agentic pipeline that scrapes job boards, ranks listings against a candidate profile, and exports daily matches [ds-radar]

→ Voter-alignment quiz for the Colombian 2026 election: live, sourced, and transparent [no-botes-tu-voto]

→ Interactive geospatial atlas across 600+ London neighborhoods [the-london-bible]

github ↗ · linkedin / cv ↗ · email ↗

$ cat about.md

About

Renzo Rico

Trained as an architect. Ended up here.
Originally from Colombia, based in Barcelona.

github ↗ · linkedin ↗

Trained as an architect. Ended up building with data. Both fields reward the same instincts: think in systems, care about the parts nobody sees, and know when something is not finished yet.

The work I tend to end up doing starts at the messy end. There is a source nobody has cleaned, a format nobody has parsed, a system that does not connect to anything useful yet. I work through that part, and then the part after it. The full chain, from whatever the raw input is to something deployed.

Looking for a data science role where the job is to build things and ship them.

$ cat capabilities.md

Capabilities

Agentic & LLM systems

Building production systems around LLMs — not demos, not wrappers. Multi-agent coordination, structured output schemas, eval loops, temperature and cost calibration.

proven by

→ [ds-radar] 3-agent pipeline (scraper → parser → scorer). Structured JSON schema enforcement + near-zero temperature to prevent scoring drift across runs.

→ [no-botes-tu-voto] Multi-pass LLM stance classification with explicit uncertainty classes; handled political ambiguity without forcing false alignment where evidence was thin.

python · llms · ai-agents · prompt-engineering
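
A minimal sketch of what schema enforcement on a scorer's output can look like, using only the standard library. It is not the ds-radar code: the field names, the 0-100 scale, and the label set (including the explicit `uncertain` class) are assumptions made for illustration. The model call itself, where a near-zero temperature would be set, is left out.

```python
import json
from dataclasses import dataclass

# Hypothetical label set; "uncertain" lets the model decline to force a verdict.
ALLOWED_LABELS = {"strong_match", "weak_match", "no_match", "uncertain"}


@dataclass(frozen=True)
class ListingScore:
    listing_id: str
    score: int       # hypothetical 0-100 scale
    label: str
    rationale: str


def parse_score(raw: str) -> ListingScore:
    """Parse one model response; raise ValueError on any schema violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    expected = {"listing_id": str, "score": int, "label": str, "rationale": str}
    if set(data) != set(expected):
        raise ValueError(f"unexpected or missing keys: {sorted(set(data) ^ set(expected))}")
    for key, typ in expected.items():
        # bool is a subclass of int in Python, so it is rejected explicitly
        if isinstance(data[key], bool) or not isinstance(data[key], typ):
            raise ValueError(f"{key} must be {typ.__name__}")
    if not 0 <= data["score"] <= 100:
        raise ValueError("score out of range")
    if data["label"] not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {data['label']}")
    return ListingScore(**data)
```

Rejecting a malformed response outright, rather than repairing it, keeps bad scores out of downstream state; the caller can retry the model call on `ValueError`.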

Data engineering & NLP

ETL from hostile sources, text corpora at scale, preprocessing for domain-specific language. The pipeline work nobody else wants to touch.

proven by

→ [legalize-co] Fetched 71.5k Colombian laws from a source with broken SOAP XML and a broken TLS chain; routed around both without compromising the corpus.

→ [un-speeches] 8k+ speech corpus with custom preprocessing pipeline and expanded stop-word lists for UN-specific jargon that tripped up general-purpose NLP models.

python · pandas · nlp · web-scraping · etl
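
The stop-word expansion mentioned for un-speeches can be sketched roughly as follows. Both word lists are tiny, made-up stand-ins (the project's actual lists are not shown here), and the regex tokenizer is an assumption chosen to keep the example dependency-free.

```python
import re
from collections import Counter

# Tiny illustrative list; a real general-purpose list would be far longer.
GENERAL_STOP_WORDS = {"the", "of", "and", "to", "in", "a", "is", "for", "we", "our", "on", "that"}
# Hypothetical domain additions: words that show up in nearly every speech
# and therefore carry little signal for topic analysis.
DOMAIN_STOP_WORDS = {"united", "nations", "assembly", "general", "president",
                     "delegation", "session", "mr"}

STOP_WORDS = GENERAL_STOP_WORDS | DOMAIN_STOP_WORDS
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text: str) -> list[str]:
    """Lowercase, split into word tokens, and drop general + domain stop words."""
    return [t for t in TOKEN_RE.findall(text.lower()) if t not in STOP_WORDS]


def top_terms(texts, n=5):
    """Count the most frequent remaining terms across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts.most_common(n)
```

Without the domain list, terms like "assembly" and "president" would dominate frequency counts in almost every document.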

Interactive data products

Data tools that users can actually use. Geospatial analysis, network graphs, civic apps. End-to-end ownership of the UI layer from data schema to deployed interface.

proven by

→ [the-london-bible] Multi-layer London atlas normalized across mismatched geographic granularities (wards, postcodes, LSOAs), unified into a coherent hex-grid for the UI.

→ [bjj-universe] 1000+ node network graph optimized with octree spatial partitioning and instanced mesh rendering to hold 30fps on commodity hardware.

typescript · next.js · python · maplibre · sigma.js
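
One way to picture the hex-grid unification is to reduce every source region to a representative point and bin those points into hexagonal cells. The sketch below does that with standard axial coordinates and cube rounding. It is an illustration under strong assumptions, not the atlas pipeline: coordinates are taken to be already projected to a planar system (for example metres), each region is collapsed to a single point, and values are combined with a plain mean.

```python
import math
from collections import defaultdict


def point_to_hex(x: float, y: float, size: float) -> tuple[int, int]:
    """Map a planar point to axial (q, r) coordinates on a pointy-top hex grid."""
    # Fractional axial coordinates of the point
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    # Cube rounding: round all three cube coordinates, then recompute the one
    # with the largest rounding error so that q + r + s == 0 still holds.
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def aggregate_to_hexes(records, size: float) -> dict[tuple[int, int], float]:
    """Average a metric per hex cell. `records` is an iterable of (x, y, value)."""
    buckets = defaultdict(list)
    for x, y, value in records:
        buckets[point_to_hex(x, y, size)].append(value)
    return {cell: sum(vals) / len(vals) for cell, vals in buckets.items()}
```

An area-weighted overlay would be more faithful for large regions that span several cells; the point-based version is the simplest form of the idea.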

how I work · team value

ambiguity → structure

I map the problem before opening an editor. Saves weeks of wrong-direction work.

end-to-end ownership

No handoff dependency. Data contracts, models, APIs, and UI — I own the full chain.

tradeoff documentation

I write down what I chose not to build and why. Teams spend less time relitigating.

shipping instinct

Deployed products, not notebooks. If it doesn't run for a user, it's not done.

$ ls exhibits/

Selected work

EX-01 ─ ds-radar · Sole builder: agent architecture, LLM scoring, pipeline CLI

source ↗

problem: Job searching for data roles is repetitive and noisy. Good opportunities are scattered across sources, and evaluating each listing manually does not scale.

input: CSV job feeds · structured candidate profile · tracker state · evaluation history

approach: Built an automated job scanning, evaluation, and tracking pipeline. New feeds are ingested, listings are evaluated against a structured profile, decisions are written into canonical tracker files, and eval markdowns stay linked to the operational state.

challenge: Keeping the pipeline reliable as it evolved. I tightened the system around a single source of truth (tracker.tsv, scan-history.tsv, and eval artifacts) so repairs, history, and downstream tooling all point to the same canonical state.

result: A reproducible workflow for ingesting job feeds, scoring relevance, tracking decisions, and generating linked evaluation artifacts for DS and analytics job searches.

python · llms · ai-agents · data-pipelines
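
A rough sketch of the single-source-of-truth idea: every write goes through one keyed upsert into the tracker file, so there is exactly one row per listing. The column layout and the `listing_id` key are hypothetical, since the real tracker.tsv schema is not described here.

```python
import csv
from pathlib import Path

# Hypothetical column layout; the real tracker.tsv schema may differ.
FIELDS = ["listing_id", "title", "score", "decision"]


def load_tracker(path: Path) -> dict[str, dict]:
    """Read the tracker into a dict keyed by listing_id (empty if the file is missing)."""
    if not path.exists():
        return {}
    with path.open(newline="", encoding="utf-8") as fh:
        return {row["listing_id"]: row for row in csv.DictReader(fh, delimiter="\t")}


def upsert(path: Path, updates: list[dict]) -> None:
    """Merge updates into the tracker by key and rewrite the file in one step."""
    rows = load_tracker(path)
    for update in updates:
        key = update["listing_id"]
        rows[key] = {**rows.get(key, {}), **update}
    tmp = path.with_suffix(".tmp")
    with tmp.open("w", newline="", encoding="utf-8") as fh:
        # DictWriter raises ValueError on unknown columns, which guards the schema
        writer = csv.DictWriter(fh, fieldnames=FIELDS, delimiter="\t", restval="")
        writer.writeheader()
        for key in sorted(rows):
            writer.writerow(rows[key])
    # Rename over the old file so readers do not see a half-written tracker
    tmp.replace(path)
```

Sorting by key keeps the file diff-friendly under version control, which helps when a repair needs to be audited later.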

EX-02 ─ no-botes-tu-voto · Sole builder: data curation, LLM classification, Next.js frontend

live ↗ · source ↗

problem: Presidential campaigns produce long, ambiguous political messaging, but voters need a clearer way to compare candidates on concrete issues.

input: Documented candidate positions · 25 quiz questions · 7 key themes · 6 presidential candidates

approach: Built an independent voter-alignment tool for the Colombian 2026 election. Users answer 25 questions, and their responses are compared against documented candidate positions with transparent sourcing and methodology.

challenge: Political positions are often vague or incomplete. The product had to stay useful without pretending every candidate had a clean, fully structured stance on every issue.

result: A live public quiz experience for Colombia 2026 that helps users compare six candidates across seven key topics using documented sources and a transparent methodology.

llms · next.js · typescript · prompt-engineering
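
To make the "no forced alignment" constraint concrete, here is one possible scoring sketch. It is not the quiz's published methodology: the 5-point scale, the distance-based agreement formula, and the use of `None` for undocumented positions are all assumptions. The point it illustrates is that missing positions are skipped and reported as coverage instead of being filled with a guess.

```python
from typing import Optional

# Hypothetical 5-point scale: -2 (strongly against) .. 2 (strongly for).
# None marks a question with no documented position for that candidate.
Position = Optional[int]
SCALE_SPAN = 4  # distance between the two extremes of the scale


def alignment(user: list[int], candidate: list[Position]) -> tuple[Optional[float], float]:
    """Return (agreement in [0, 1] or None, coverage in [0, 1])."""
    if len(user) != len(candidate):
        raise ValueError("answer and position lists must have the same length")
    # Only compare on questions where the candidate has a documented position
    pairs = [(u, c) for u, c in zip(user, candidate) if c is not None]
    coverage = len(pairs) / len(user) if user else 0.0
    if not pairs:
        return None, coverage
    agreement = sum(1 - abs(u - c) / SCALE_SPAN for u, c in pairs) / len(pairs)
    return agreement, coverage
```

Showing coverage next to the agreement score lets a user see when a high match rests on only a handful of documented positions.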

EX-03 ─ the-london-bible · Sole builder: data pipeline, geospatial normalization, map UI

live ↗ · source ↗

problem: London is experienced as layers (transport, density, amenities, schools, hospitals, housing, and geography), but those layers are rarely explored together in one place.

input: London borough and ward geometry · MSOA density data · Tube lines · bikes · POIs · schools · hospitals and other civic overlays

approach: Built a self-contained London atlas: a static web app that combines multiple civic and spatial layers into a single editorial mapping experience with switches for views, metrics, overlays, and location-based exploration.

challenge: The hard part was not just gathering datasets, but turning them into a coherent and legible map product with consistent overlays and a browsing experience that invites comparison rather than overwhelming the user.

result: A deployed interactive London atlas that lets users explore density, transport, amenities, and civic infrastructure through one layered map interface.

python · maplibre · geojson · next.js

$ ls projects/

More projects

EX-04 ─ un-speeches · An interactive UN speeches analysis project that turns a large diplomatic text archive into something searchable, inspectable, and analytically usable.

live ↗ · source ↗

EX-05 ─ legalize-co · A growing open-source corpus of Colombian legislation in Markdown, versioned in git, with 71,500 laws committed in the current repository state.

source ↗

EX-06 ─ bjj-universe · A deployed interactive BJJ graph experience backed by processed ADCC data, with a production-grade frontend foundation and a clear path toward deeper competition analytics.

live ↗ · source ↗

$ ./contact.sh

Get in touch

email: renzorico10@gmail.com · For roles, collaborations, or questions

github: github.com/renzorico · All projects, open source

linkedin: linkedin.com/in/renzorico · Professional profile and connect