Lead Researcher — Model-Agnostic Infrastructure for Unstructured Financial Text
Dec 2025 – Present
Working Paper
Formalizing an end-to-end, model-agnostic architecture for processing high-dimensional, unstructured financial text, so that the pipeline's structure stays invariant when the underlying language model changes. Designing a modular data engineering pipeline that decouples stochastic noise filtering from downstream deep learning systems and prepares input embeddings to remain compatible with future models. Formulating a novel, domain-specific reward function and objective metric that maps linguistic signal extraction directly to portfolio optimization, independent of the underlying transformer or LLM.
Architected a constraint satisfaction scheduling system for a 150+ user campus environment, formalizing combinatorial resource allocation as a CSP and applying backtracking search with constraint propagation to guarantee feasibility under competing institutional constraints. Engineered PostgreSQL indexing strategies that reduced query latency by 40% under high-cardinality constraint evaluations, and surfaced scheduling state through a real-time React visualization layer.
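A minimal sketch of backtracking search with constraint propagation, in the form of forward checking, for a scheduling CSP of this kind. It is illustrative only and is not the project's code: the (room, time) slot model, the `conflicts` pairs, and the function name `solve` are assumptions made for the example.

```python
from typing import Dict, Optional, Set, Tuple

Slot = Tuple[str, str]  # (room, time): a hypothetical resource model


def solve(
    domains: Dict[str, Set[Slot]],
    conflicts: Set[frozenset],
    assignment: Optional[Dict[str, Slot]] = None,
) -> Optional[Dict[str, Slot]]:
    """Return a feasible event -> slot assignment, or None if none exists.

    domains:   slots still allowed for each event
    conflicts: pairs of events that may not share a time (e.g. same instructor)
    """
    assignment = dict(assignment or {})
    unassigned = [e for e in domains if e not in assignment]
    if not unassigned:
        return assignment

    # Minimum-remaining-values heuristic: branch on the tightest event first.
    event = min(unassigned, key=lambda e: len(domains[e]))

    for slot in sorted(domains[event]):
        time = slot[1]
        pruned: Dict[str, Set[Slot]] = {}
        feasible = True
        for other in unassigned:
            if other == event:
                continue
            clash = frozenset((event, other)) in conflicts
            # Forward checking: drop the slot just taken, and for conflicting
            # events drop every slot at the same time.
            keep = {
                s for s in domains[other]
                if s != slot and not (clash and s[1] == time)
            }
            if not keep:
                feasible = False  # a domain was wiped out, so prune this branch
                break
            pruned[other] = keep
        if not feasible:
            continue
        result = solve(
            {**domains, **pruned, event: {slot}},
            conflicts,
            {**assignment, event: slot},
        )
        if result is not None:
            return result
    return None  # every slot failed; the caller backtracks
```

Because the search is exhaustive, a `None` result means no assignment satisfies the constraints as modeled; pruning only removes slots that could not appear in any solution extending the current partial assignment.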
Developed a two-stage backtesting engine for 10 commodity sectors over 162 weekly periods using SLSQP minimum-variance optimization, holding annual volatility to a maximum of 7.7%. Enforced KKT optimality conditions for stable weight constraints, and implemented Ledoit-Wolf covariance shrinkage together with a Stage A signal-blending / Stage B cross-sector allocation pipeline that reduced maximum drawdown by 15%.
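For reference, a minimum-variance allocation with a shrunk covariance matrix can be written as below. This is a generic formulation and not necessarily the exact one used in the project: the fully invested, long-only constraint set and the symbols $S$ (sample covariance), $F$ (shrinkage target), and $\delta$ (shrinkage intensity) are assumptions for illustration.

```latex
\begin{aligned}
\hat{\Sigma} &= \delta F + (1-\delta)\, S, \qquad 0 \le \delta \le 1, \\
\min_{w}\; & w^{\top} \hat{\Sigma}\, w
\quad \text{s.t.} \quad \mathbf{1}^{\top} w = 1, \;\; w \ge 0, \\
\text{KKT:}\quad & 2\hat{\Sigma} w - \lambda \mathbf{1} - \mu = 0, \qquad
\mu_i \ge 0, \qquad \mu_i w_i = 0 \;\; \text{for all } i .
\end{aligned}
```

Here $\lambda$ is the multiplier on the budget constraint and $\mu$ the multipliers on the non-negativity bounds. A weight vector returned by the solver can be checked against these conditions to confirm it is a true optimum and not an early-terminated iterate.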
Engineered a real-time computer vision system to monitor crop health in Martian greenhouses, utilizing AWS SageMaker for autonomous stress detection. Developed a multi-modal data pipeline to process environmental telemetry and spectral imagery, enabling automated nutrient deficiency diagnosis in extraterrestrial environments.
Originally designed as a predictive ML model for drought patterns in CDMX using hydrological data. Pivoted to a crowdsourced reporting system to bridge data gaps. Designed the data ingestion pipeline to validate user reports against historical meteorological norms.
Accuracy: 70%–85% (5-Day Horizon)
Inference: < 500 ms
Features: 14 indicators
Source: OpenWeather + Crowdsourced
Systemic Risk Modeling, Data Pipelines, Crowdsourcing
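One simple way to validate a crowdsourced report against historical norms is a z-score check, sketched below. The `Report` fields, the rainfall variable, and the three-standard-deviation threshold are hypothetical choices for illustration; the project's actual validation rules are not described here.

```python
from dataclasses import dataclass
from statistics import mean, stdev
from typing import Sequence


@dataclass
class Report:
    station_id: str     # hypothetical field names
    rainfall_mm: float


def is_plausible(
    report: Report,
    history_mm: Sequence[float],
    max_z: float = 3.0,
) -> bool:
    """Accept a report that lies within max_z standard deviations of the
    historical mean for the same location and season."""
    if report.rainfall_mm < 0:
        return False  # physically impossible reading
    if len(history_mm) < 2:
        return True   # too little history to judge; leave to other checks
    mu, sigma = mean(history_mm), stdev(history_mm)
    if sigma == 0:
        return report.rainfall_mm == mu
    return abs(report.rainfall_mm - mu) / sigma <= max_z
```

A fixed threshold trades false rejections of genuine extreme events against acceptance of erroneous reports, so in practice flagged reports might be queued for review instead of being discarded.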
Bloomly: Global Bloom Detection System
Oct 2025
@ NASA Space Apps Hack
Engineered a LightGBM-based predictive model leveraging multi-spectral satellite imagery from Google Earth Engine (GEE) and NASA POWER meteorological data. Conducted dimensionality reduction across 44 distinct ecological indicators to classify global bloom patterns, validating the classifier with ROC-AUC and F1 scores.
ROC-AUC: 0.72–0.85
F1 Score: 0.70–0.82
Features: 44 indicators
Source: GEE + NASA POWER
Python, LightGBM, Remote Sensing (GEE)
Multi-Agent Simulation
Aug 2025 – Sep 2025
Simulated autonomous agent behavior using Python (Mesa) and Unity to analyze strategic decision-making dynamics. Designed reward-based optimization functions within constrained state spaces, applying Monte Carlo sampling to identify emergent Nash equilibrium patterns.
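As a rough illustration of how Monte Carlo sampling can surface Nash equilibria, the sketch below estimates expected payoffs for each action profile by repeated sampling, then keeps the profiles where neither agent gains by deviating alone. It is an assumption-laden toy: the two-agent setup, the function names, and the noisy prisoner's dilemma standing in for a Mesa or Unity episode are all hypothetical.

```python
import random
from itertools import product
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Profile = Tuple[str, str]
Payoffs = Tuple[float, float]


def estimate_payoffs(
    simulate: Callable[[Profile, random.Random], Payoffs],
    actions: Sequence[str],
    n_samples: int = 2000,
    seed: int = 0,
) -> Dict[Profile, Payoffs]:
    """Monte Carlo estimate of each agent's expected payoff per profile."""
    rng = random.Random(seed)
    table: Dict[Profile, Payoffs] = {}
    for profile in product(actions, repeat=2):
        samples = [simulate(profile, rng) for _ in range(n_samples)]
        table[profile] = (
            mean(s[0] for s in samples),
            mean(s[1] for s in samples),
        )
    return table


def pure_nash(table: Dict[Profile, Payoffs], actions: Sequence[str]) -> List[Profile]:
    """Profiles where no agent improves by changing only its own action."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        u_a, u_b = table[(a, b)]
        a_ok = all(table[(alt, b)][0] <= u_a for alt in actions)
        b_ok = all(table[(a, alt)][1] <= u_b for alt in actions)
        if a_ok and b_ok:
            equilibria.append((a, b))
    return equilibria


# Hypothetical stand-in for one simulation episode: a prisoner's dilemma
# whose payoffs are observed with Gaussian noise.
BASE = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def noisy_dilemma(profile: Profile, rng: random.Random) -> Payoffs:
    u_a, u_b = BASE[profile]
    return u_a + rng.gauss(0, 1), u_b + rng.gauss(0, 1)


table = estimate_payoffs(noisy_dilemma, ["C", "D"])
print(pure_nash(table, ["C", "D"]))
```

With enough samples the estimated payoff table approaches the noise-free one, so the equilibrium check recovers mutual defection; with few samples or small payoff gaps, the estimates can misorder payoffs and the result should be treated as approximate.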