About
I am a data professional living in Vancouver, BC, with over two years of experience as a Data Engineer in the retail industry, focused on designing and building data-driven software solutions. I am currently completing my Master's degree in Data Science and will be available for full-time work starting July 2025.
What I do!
With more than two years of experience in data engineering, I build modern big-data pipelines and workflows using a wide range of technologies.
My background includes:
- Building ETL/ELT pipelines across diverse platforms and frameworks
- Deploying and operating data platforms on AWS (S3, Glue, Athena, Lambda, etc.)
- Orchestrating distributed workloads with PySpark and Airflow (see the sketch after this list)
- Developing high-quality Kafka producers and consumers
- Designing efficient warehouses and query systems on Snowflake and other modern OLAP tools
- Creating stakeholder-focused visualizations in Amazon QuickSight, Grafana, Dash, and Streamlit
- Setting up CI/CD pipelines to automate and safeguard data workflows
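For illustration, here is a minimal sketch of the Airflow-plus-PySpark pattern from the list above: a daily DAG that submits a Spark job. The DAG id, script path, and connection are hypothetical placeholders, and the snippet assumes the `apache-airflow-providers-apache-spark` package is installed.

```python
# Minimal sketch: a daily Airflow DAG that submits a PySpark job.
# All names and paths below are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

with DAG(
    dag_id="daily_sales_etl",          # hypothetical pipeline name
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    transform_sales = SparkSubmitOperator(
        task_id="transform_sales",
        application="jobs/transform_sales.py",  # hypothetical PySpark script
        conn_id="spark_default",                # default Spark connection
    )
```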
What sets me apart is not just my technical toolkit but my commitment to continuous learning and adaptability. Whether I’m defining data contracts, extending CI/CD practices, or adopting new architectural patterns, I pursue the most effective solution for each challenge. I’m especially driven to deliver end-to-end data solutions that maximize business value.
Data Engineering
Building modern big data pipelines and workflows.
Although my professional roles have centered on data engineering, my Master’s in Data Science has given me a foundation in machine learning. Through coursework and independent projects, I have designed and implemented solutions across the full ML spectrum:
Technical strengths:
- Supervised learning: end‑to‑end modeling pipelines with scikit‑learn, XGBoost, LightGBM, and random forests, covering feature engineering, hyper‑parameter tuning, and model evaluation (see the sketch after this list)
- Unsupervised learning: word‑embedding methods (Word2Vec, GloVe), topic modeling with LDA and BERTopic, and recommender systems that leverage collaborative and content‑based filtering
- Deep learning: hands‑on experience building and fine‑tuning CNNs, RNNs/LSTMs, and Transformer architectures in PyTorch
- MLOps fundamentals: experiment tracking in MLflow, version control with DVC, and containerized deployment prototypes on AWS (ECR/ECS)
- Project mindset: rapid‑prototyping notebooks promoted to production‑ready code, with clear documentation and visualization of model insights for non‑technical stakeholders
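To make the supervised-learning bullet concrete, here is a minimal sketch of a scikit-learn pipeline that couples preprocessing with hyper-parameter search; the feature names and parameter grid are illustrative assumptions, not from a specific project.

```python
# Minimal sketch: preprocessing, model, and hyper-parameter search in one pipeline.
# Column names and the parameter grid are hypothetical placeholders.
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["age", "income"]),                  # hypothetical numeric features
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["region"]),   # hypothetical categorical feature
])

pipe = Pipeline([
    ("prep", preprocess),
    ("model", RandomForestClassifier(random_state=42)),
])

search = GridSearchCV(
    pipe,
    param_grid={"model__n_estimators": [200, 400], "model__max_depth": [None, 10]},
    cv=5,
    scoring="f1",
)
# search.fit(X_train, y_train) would then report the best cross-validated F1
# via search.best_score_ and the winning settings via search.best_params_.
```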
While I have not yet held a formal Machine Learning Engineer title, I consistently seek out ML‑focused projects—ranging from sports‑analytics models to NLP prototypes—to deepen my expertise. I approach each new challenge with the curiosity and discipline needed to translate state‑of‑the‑art research into practical, business‑ready solutions.
Machine Learning
Developing end‑to‑end ML solutions from prototype to production.
My focus on data architecture complements my engineering and ML skills, enabling me to translate business requirements into resilient, high‑performance data platforms. For each problem, I aim to select the design patterns and architectural approach that fit it best.
Core capabilities:
- System design & architecture: blueprinting end‑to‑end solutions that balance scalability, cost, and maintainability
- Data modeling & lake‑house patterns: crafting relational and dimensional models, plus lake architectures (e.g., S3 + Iceberg) that support both analytics and ML workloads (see the sketch after this list)
- ML‑ready pipelines: integrating feature stores, model‑training jobs, and real‑time/batch inference into unified workflows
- Cloud infrastructure on AWS: provisioning secure, elastic stacks with services like S3, Glue, Athena, EMR, Lambda, and Redshift
- Docker containers: containerizing applications for consistent and repeatable deployments
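As a sketch of the S3 + Iceberg pattern mentioned above, here is how an Iceberg table might be created from Spark SQL; the catalog name, bucket, and schema are hypothetical, and the snippet assumes the Iceberg Spark runtime jar is on the classpath.

```python
# Minimal sketch: a lake-house table on S3 with Apache Iceberg via Spark SQL.
# Catalog name, bucket, and schema below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("iceberg-demo")
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.lake", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.lake.type", "hadoop")
    .config("spark.sql.catalog.lake.warehouse", "s3://my-bucket/warehouse")  # hypothetical bucket
    .getOrCreate()
)

# Partitioning by day of the order timestamp keeps scans cheap for daily analytics.
spark.sql("""
    CREATE TABLE IF NOT EXISTS lake.db.orders (
        order_id BIGINT,
        order_ts TIMESTAMP,
        amount   DECIMAL(10, 2)
    )
    USING iceberg
    PARTITIONED BY (days(order_ts))
""")
```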
Continuous improvement mindset:
- Expanding DevOps proficiency with Terraform/IaC, Jenkins CI/CD, and Kubernetes for container orchestration
- Evaluating emerging frameworks such as Apache Iceberg, Delta Lake, dbt, and data‑contract tooling to adopt best‑in‑class patterns
- Applying proven design principles (e.g., medallion, domain‑driven design, event sourcing) to ensure each architecture fits the problem rather than forcing a one‑size‑fits‑all approach
I actively pursue new technologies and architectural patterns, always aiming to deliver data platforms that are reliable today and adaptable for tomorrow’s demands.
Data Architecture
Designing scalable, cloud-native data platforms and lake-house architectures.
My software‑engineering background rounds out my data skill set, letting me deliver end‑to‑end products that are as dependable as they are user‑friendly.
Key strengths:
- Full‑stack development: modern web/mobile apps and APIs (REST, GraphQL, webhooks) with React/React Native, Astro, and Next.js on the front end, paired with Django, Node.js, and FastAPI on the back end (see the sketch after this list)
- Engineering fundamentals: solid grounding in data structures & algorithms, software‑design patterns, and agile methodologies
- Quality & reliability: comprehensive testing strategies, CI/CD pipelines, and automated deployments for rapid yet safe iteration
- Operations: microservices architecture and Docker‑based containerization for scalable, maintainable services
- Lifecycle stewardship: clear documentation, observability, and structured maintenance processes that keep applications healthy in production
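As a small illustration of the API side of this work, here is a minimal FastAPI sketch of a typed REST endpoint; the resource and its fields are hypothetical placeholders.

```python
# Minimal sketch: a typed REST endpoint in FastAPI.
# The Order resource and its fields are hypothetical placeholders.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Order(BaseModel):
    order_id: int
    amount: float

@app.get("/orders/{order_id}", response_model=Order)
def read_order(order_id: int) -> Order:
    # A real service would query a database here; this returns a stub record.
    return Order(order_id=order_id, amount=99.5)
```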
I continually evaluate new languages, frameworks, and architectural patterns to ensure each solution is the best possible fit for the problem at hand.
Software Engineering
Crafting robust full‑stack applications and API‑driven services.