  • OrangeDoor IT / GETTER / VivaTerra Ventures
  • Brazil
  • 21:18 (UTC -03:00)
  • LinkedIn in/yan-cotta
  • X @CottaYan

YanCotta/README.md

YanCotta

Due to NDAs and corporate policies, the vast majority of my professional-grade code resides in private repositories. The projects showcased here are primarily my academic and personal side-projects where I experiment, build, and deploy end-to-end systems from scratch.


🏆 TL;DR - ACHIEVEMENTS

Quick summary of awards and key metrics

AI Solutions & Delivery Lead @ OrangeDoor IT: Leading technical delivery for one of the world’s largest Salesforce Agentforce implementations (São Paulo’s Poupatempo platform, 46M+ citizens, ~200M BRL contracts). Owning full lifecycle delivery in 1–2 month cycles while bridging government stakeholders and world-class Salesforce teams.

AI Architect & Strategist @ GETTER S.A.: Driving AI architecture and pre-sales strategy for a Top 10 Industry 5.0 startup and South Summit Top 50 finalist (out of 2,500+ global startups). Architecting multi-agent systems for enterprise clients including Jabil, Valgroup, Renault Global, and Vale.

AI Architect & Product Strategist @ VivaTerra Ventures: Orchestrating 0-to-1 AI architecture for a VC-backed global stealth startup. Secured early access to frontier Google/DeepMind technologies (AlphaEarth, AlphaEvolve, Co-Scientist); supporting a 150M BRL seed raise while designing planetary-scale multi-modal systems.

Data Manager @ Embrapa Dairy Cattle: Architected high-performance data pipelines and the “Sovereign Bio-Graph” for one of the world’s largest international genomic improvement programs across 10+ countries.

AI Engineer @ FrameNet Brasil: Engineered a hybrid neuro-symbolic AI system achieving a 6x performance increase in abstract semantic reasoning by fusing Vision Transformers with linguistic logic.

1st Place Winner – Reply Enterprise Challenge (FIAP NEXT 2025): Solo-architected a production-grade 12-agent predictive maintenance platform, validated at 103.8 RPS with 3ms P99 latency and proven to reduce unplanned downtime by 40%.



👨‍💻 ABOUT ME


Operating at the intersection of AI, Solutions Architecture, Data Governance, C-Level Strategy, and Business Value, designing the infrastructure and technology of the next economic era.

As a Fractional AI Solutions Architect, I partner with disruptive startups and global enterprises to design and govern safe, state-of-the-art AI ecosystems that drive massive market ROI. I bridge the gap between deep R&D and the boardroom, translating complex data physics into scalable B2B revenue.

💼 Fractional Mandates & Strategic Architecture

  • AI Solutions & Delivery Lead @ OrangeDoor IT: Leading technical delivery and governance for one of the largest global Salesforce Agentforce implementations (São Paulo’s Poupatempo, 46M+ citizens). Acting as the central bridge between PRODESP stakeholders, Salesforce architects, and delivery squads while owning the full product lifecycle in aggressive 1–2 month cycles.

  • AI Architect & Strategist @ GETTER S.A.: Driving multi-agent system architecture and global pre-sales strategy for a Top 10 Industry 5.0 startup and South Summit Top 50 finalist. Supporting enterprise engagements with clients including Jabil, Valgroup, Renault Global, and Vale.

  • AI Architect & Product Strategist @ VivaTerra Ventures: Leading 0-to-1 technical architecture and product strategy for a VC-backed stealth startup. Secured early access to frontier Google/DeepMind technologies; architecting planetary-scale multi-modal data pipelines while supporting a 150M BRL seed raise.

  • Data Manager: Strategy & Innovation @ Embrapa/ABCGIL: Architected sovereign data ecosystems and high-performance pipelines for international genomic programs spanning 10+ countries, bridging PhD researchers and mission-critical business KPIs.

🔬 Intellectual Moat & Scientific Authority

  • Agentic AI Research @ UFJF (M.Sc. CS): Researching Cognitive Multi-Agent Systems (MAS) for autonomous decision-making and semantic interoperability in heterogeneous environments.
  • Award-Winning Systems: Solo-architected the 1st place winner of the Reply Enterprise Challenge (FIAP NEXT 2025)—a production-grade platform for predictive maintenance.




💼 PROFESSIONAL & RESEARCH EXPERIENCE

Proven track record in AI Engineering, Leadership & Bioinformatics

AI Solutions & Delivery Lead | OrangeDoor IT | São Paulo, Brazil (Hybrid) | Mar 2026 – Present

Leading technical delivery and architecture for one of the world’s largest Salesforce Agentforce implementations — São Paulo’s Poupatempo platform serving 46M+ citizens and backed by contracts valued at ~200M BRL.

  • Own the full product lifecycle (discovery → architecture → development → QA → PROD → sustainment), consistently delivering iterations in 1–2 month agile cycles.
  • Serve as the primary technical authority bridging PRODESP/SGGD stakeholders, Salesforce architects, and multi-disciplinary delivery squads.
  • Architect and govern complex multi-agent systems integrating Agentforce, Data Cloud, MuleSoft, and custom LLM layers in a high-stakes government environment.
  • Drive technical governance, risk mitigation, and scope negotiation while maintaining delivery velocity and solution integrity.

Key Areas: Agentforce • Multi-Agent Systems • Enterprise AI • Technical Governance • Product Ownership • Agile Delivery


AI Architect & Strategist | GETTER S.A. | Remote (Manaus, AM) | Mar 2026 – Present

Driving AI architecture and strategic consulting for a Top 10 Industry 5.0 startup and South Summit Brazil Top 50 finalist (out of 2,500+ global startups). Acting as the critical nexus between high-stakes business development and elite engineering for global enterprise clients.

  • Architecting and governing complex multi-agent/agentic infrastructures for clients including Jabil, Valgroup, Renault Global, and Vale.
  • Leading pre-sales architecture and proposal development for large-scale enterprise contracts.
  • Providing strategic technical consulting on system design, tech stack decisions, and prioritization to maximize business ROI.
  • Aligning visionary Industry 5.0 goals with executable, high-impact technology solutions across engineering, product, and executive stakeholders.

Key Areas: Multi-Agent Systems • Industry 5.0 • Enterprise AI • Pre-Sales Architecture • Tech Governance


AI Architect & Product Strategist | VivaTerra Ventures | Remote (São Paulo, SP) | Mar 2026 – Present

Orchestrating 0-to-1 AI architecture, product strategy, and data science for a VC-backed global stealth startup at the intersection of applied biology, multi-modal AI, and decentralized systems.

  • Secured early access to frontier Google/DeepMind technologies (AlphaEarth, AlphaEvolve, Co-Scientist), collaborating with DeepMind professionals and UEMS scientists.
  • Architecting autonomous multi-modal data pipelines and complex system architectures for planetary-scale data processing and a platform designed for millions of global users.
  • Supporting a 150M BRL seed raise through high-impact technical pitches to investors, VCs, governors, and enterprise stakeholders.
  • Co-piloting product strategy and UX vision while ensuring deep scientific rigor and systemic constraints are preserved in the interface layer.

Key Areas: Multi-Modal AI • Planetary-Scale Systems • Enterprise AI Architecture • Product Strategy • Frontier AI


Data Manager: Strategy & Innovation | Embrapa/ABCGIL | Juiz de Fora, MG | Nov 2025 - Apr 2026

  • Sovereign AI & LLMOps: Architected the "Sovereign Bio-Graph," moving national genomic assets to air-gapped high-performance computing, and modernized legacy pipelines by implementing state-of-the-art LLMOps and automated workflows.
  • Global R&D Leadership: Bridged the gap between PhD researchers, world-class laboratory partners (Zoetis, Neogen), and industry-leading associations (ABCGIL), aligning complex genomic R&D with business KPIs across 10+ countries.
  • Enterprise Data Engineering: Engineered robust ETL pipelines (Python, Pandas, Linux) and managed hybrid database solutions using PostgreSQL (Relational) and Neo4j (Graph) to ingest and standardize complex genomic structures.

Key Areas: Sovereign AI • LLMOps • Data Engineering • Graph Databases • ETL Pipelines • Neo4j • PostgreSQL


AI Researcher/Engineer (Neuro-Symbolic Architecture) | FrameNet Brasil / UFJF | Juiz de Fora, MG | Oct 2025 - Mar 2026

  • Architectural Innovation (Project ReINVenTA): Led the engineering of a Hybrid Neuro-Symbolic System under the supervision of PhD researchers. Fused visual perception with structured linguistic logic to solve critical data quality issues (noisy labels) and scarcity constraints.
  • SOTA Performance: Achieved a 6x improvement in multi-label classification accuracy, transforming a failing baseline into a robust model capable of abstract semantic reasoning.
  • Advanced Tech Stack: Engineered end-to-end multi-modal solutions utilizing PyTorch, OpenAI CLIP (ViT-B/32), YOLOv8, and Vision Transformers (ViT), applying techniques like Asymmetric Loss (ASL) and Zero-Shot Learning in an NVIDIA A30 GPU optimized environment.

Key Areas: Neuro-Symbolic AI • Computer Vision • PyTorch • Hugging Face • Zero-Shot Learning
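To illustrate the Asymmetric Loss (ASL) technique named above, here is a minimal pure-Python sketch of the published formulation for one multi-label sample. It is not the ReINVenTA code: the function name, the default hyperparameters (`gamma_pos=0`, `gamma_neg=4`, `margin=0.05`), and the sample values are assumptions chosen for illustration.

```python
import math


def asymmetric_loss(probs, targets, gamma_pos=0.0, gamma_neg=4.0,
                    margin=0.05, eps=1e-8):
    """Sum of ASL terms over the labels of one sample.

    probs   -- predicted probabilities in [0, 1], one per label
    targets -- 1 for a positive label, 0 for a negative label
    """
    total = 0.0
    for p, y in zip(probs, targets):
        if y == 1:
            # Positive term: focal-style weighting with gamma_pos.
            total += -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
        else:
            # Negative term: shift the probability down by the margin so
            # easy negatives contribute exactly zero, then down-weight
            # the rest with gamma_neg.
            p_m = max(p - margin, 0.0)
            total += -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))
    return total


# Hypothetical predictions for a 3-label sample (illustrative values).
loss = asymmetric_loss([0.5, 0.04, 0.9], [1, 0, 0])
print(f"ASL for the sample: {loss:.4f}")
```

The asymmetry is the point of the design: with noisy, sparse labels, most negatives are easy and would otherwise dominate the gradient, so they are suppressed much harder than positives.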


Project Lead: AI/ML Engineering & Data Science | SuperDataScience | Remote | Jun 2025 - Jan 2026

  • Agentic AI Leadership: Architected "FinResearch AI", a multi-agent system using CrewAI to automate institutional financial research, pivoting teams from static notebooks to production-grade orchestration.
  • Predictive ML Platforms: Delivered full-stack healthcare (GlucoTrack) and HR analytics (MLPayGrade) platforms using Deep Learning, Model Explainability, and Tabular Embeddings.
  • Global Team Management: Orchestrated the full lifecycle for diverse international cohorts, aligning KPIs, conducting 1x1 mentorship, and enforcing software engineering best practices for scalable deployment.

Key Areas: Agentic AI • Multi-Agent Systems • CrewAI • Technical Leadership • LLMs • RAG • Full-Stack ML


Data Engineer (R&D) | Embrapa Gado de Leite | Juiz de Fora, MG | Sep 2025 - Nov 2025

  • Improved genomic query performance by 87% by migrating from PostgreSQL to Neo4j.
  • Architected a scalable bioinformatics fullstack pipeline for genomic analysis (Docker, Nextflow, FastAPI).
  • Optimized project presentations for stakeholders and executives responsible for laboratory budget and resources.

Key Areas: Genomics • Bioinformatics • Data Engineering • Neo4j


AI Trainer (LLM Systems via RLHF) | Outlier | Remote | Nov 2024 - Sep 2025

  • Developed technical content to align Large Language Models (OpenAI, Meta, Anthropic), increasing model efficiency by 64% via RLHF in collaboration with technical teams.

Key Areas: RLHF • Model Alignment • AI Safety • LLMs • Quality Assurance


Data Analyst (Ecological Impact) | Impaakt | Remote | Feb 2022 - Oct 2024

  • Delivered 500+ data-driven ecological impact reports that influenced ESG (Environmental, Social, and Governance) ratings used by investment firms.

Key Areas: Environmental Science • Sustainability Analysis • Data Analysis • Process Optimization • AI Integration • Impact Assessment


Research Assistant | Georgia State University | Atlanta, GA | Feb 2019 - Feb 2020

  • Increased research productivity by 84% by automating data collection and analysis workflows using Python.

Key Areas: Cognitive Sciences • Philosophy of Mind • Psychology • Behavioral Analysis • Research Methodology • Data Analysis • Data Science • Python



🎓 ACADEMIC BACKGROUND


Master of Science (M.Sc.) - Computer Science | Universidade Federal de Juiz de Fora (UFJF) | 2026 - 2028 (expected)

  • Research Focus: Architecting Cognitive Multi-Agent Systems (MAS), semantic interoperability (Ontologies) in heterogeneous data, and autonomous decision-making based on Green AI.
  • Key Graduate Coursework: Intelligent Agents • Autonomous Software Systems • Artificial Intelligence in Software Engineering • Machine Learning • Applied Intelligent Systems.
  • Academic Excellence: Admitted with a 91.25 score on the Scientific Research Project defense.

Bachelor of Technology (Technologist Degree) - AI Systems & Machine Learning | FIAP | 2024 - 2026

Key Areas: AI Systems Architecture • Machine Learning Engineering • MLOps • Edge AI • IoT Development • Software Engineering • Data Engineering • Cybersecurity • Cloud Operations

Academic Excellence: GPA 4.0


Bachelor of Science - Biological Sciences | UniAcademia | 2022 - 2025

Key Areas: Molecular Biology • Genetics • Computational Biology • Research Methodology • Laboratory Management • Scientific Publishing

Academic Excellence: GPA 3.7 | Thesis: Epigenetics Antiaging Health Software Leveraging Machine Learning & Deep Learning Algorithms


Bachelor of Science - Philosophy (Major) & Psychology (Minor) | Georgia State University | 2017 - 2020 (incomplete)

Key Areas: Cognitive Sciences • Philosophy of Mind • Psychology • Human Behavior • Research Methodology • Academic Leadership

Academic Excellence: GPA 3.8 | Thesis: Differentiating Factual Belief, Imagination & Religious Credence - A Systematic Theory of Cognitive Attitudes

Additional Recognition: Columnist for "The Signal" (GSU's award-winning newspaper), Atlanta Campus Scholarship recipient, Dean's List, Honor Society member





🤝 PROFESSIONAL RECOMMENDATIONS

What others say about working with me

View all recommendations on LinkedIn

I've been fortunate to work with exceptional professionals who have recognized my technical capabilities, problem-solving approach, and collaborative leadership style. These recommendations span my work in:

  • AI/ML Engineering & Research
  • Data Science & Analytics
  • Project Leadership & Team Collaboration
  • Academic Research & Scientific Methodology




End-to-end AI systems architected to solve real-world challenges

This portfolio showcases end-to-end AI systems I've architected to solve real-world challenges. Each project demonstrates business impact, technical excellence, and production-ready implementation.


🏭 Smart Maintenance SaaS (Hermes)

🏆 1st PLACE WINNER - Reply Enterprise Challenge @ FIAP NEXT 2025 🏆

An end-to-end, production-grade predictive maintenance platform I built from scratch (investing hundreds of hours) to win Reply's annual enterprise challenge. This system uses a 12-agent event-driven architecture (FastAPI, Redis) and 17 ML models (trained on 6 real-world datasets, including NASA, AI4I, and XJTU) to predict equipment failures before they happen.

  • Business Value: Proven to reduce unplanned downtime by 40% and save R$ 100-500k per prevented failure.
  • Performance: Validated at 103.8 RPS with 3ms P99 latency under load.
  • Database: Achieved 37% faster dashboard queries using TimescaleDB continuous aggregates.
  • Stack: Python • FastAPI • TimescaleDB • MLflow • Docker • AWS • Streamlit (NDA Expired - Repository open for architectural review)
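For readers unfamiliar with the load-test figures above, the sketch below shows one common way to derive requests per second and a P99 latency from raw samples, using the nearest-rank percentile method. It is not the project's benchmarking code: the helper names and the sample data are hypothetical, and load-testing tools may use a different percentile interpolation.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


def throughput_rps(request_count, duration_seconds):
    """Average requests per second over the whole test window."""
    return request_count / duration_seconds


# Hypothetical run: 200 requests completed in 2 seconds (made-up values).
latencies_ms = [1.0] * 180 + [3.0] * 19 + [40.0]

print("RPS:", throughput_rps(len(latencies_ms), 2.0))
print("P99 (ms):", percentile(latencies_ms, 99))
print("Max (ms):", percentile(latencies_ms, 100))
```

Reporting P99 rather than the mean keeps a single slow outlier (the 40 ms request here) from hiding or distorting what nearly all requests experience.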

📑 Invoice Automation System (Full-Stack & Multi-Agent)

Solo Development | AI-powered invoice processing automation

  • Business Goal: To eliminate the slow, error-prone manual process of invoice handling for small to medium businesses.
  • Solution & Impact: Built a full-stack system that automates the entire invoice processing pipeline. By mapping the user journey and applying RAG for intelligent error handling, the system reduced manual processing time by over 85%.
  • Technologies: React.js • Next.js • TypeScript • FastAPI • LangChain • RAG • FAISS • Docker • AWS S3 • PostgreSQL

🌍 Guardian System: National Resilience Platform

🏆 Award Winner - FIAP Global Solution 2025.1

  • Business Goal: To create a predictive system to manage and mitigate large-scale national crises like natural disasters.
  • Solution & Impact: I single-handedly architected and developed this award-winning multi-agent platform. It comprises five autonomous "Guardian" agents, each covering a different threat domain, and includes a fully functional MVP for fire risk prediction using real-time IoT sensor data.
  • Technologies: Agentic AI • Python • FastAPI • Docker • MicroPython • ESP32 • IoT • Apache Spark

🧬 AI Platform for Anti-Aging (Thesis Project)

Solo Development | Personalized anti-aging recommendation system

  • Business Goal: To create a scalable HealthTech platform that provides personalized, data-driven health recommendations, moving beyond generic advice.
  • Solution & Impact: Developing an AI platform focused on Explainable AI (SHAP) and secure deployment (JWT). The system translates complex epigenetic data (BioPython) into actionable health insights by analyzing genetic predispositions (SNPs) and lifestyle habits to generate personalized risk assessments.
  • Technologies: PyTorch • Scikit-learn • BioPython • MLflow • SHAP • Docker • FastAPI • React

🚜 FarmTech Integrated Ecosystem (IoT & Edge AI)

End-to-End Architecture | Unified smart farming system integrating IoT, Cloud, and Hybrid AI

  • Business Goal: To optimize agricultural ROI by minimizing water usage and crop loss through real-time telemetry and automated decision-making.
  • Solution & Impact: A massive 6-module ecosystem combining Edge AI (YOLOv5) for pest detection and Cloud AI (GPT-4o) for insights. Features a custom Genetic Algorithm that solves the "knapsack problem" for crop allocation and a distributed ESP32 IoT network for predictive irrigation.
  • Technologies: Python • AWS • IoT (ESP32) • YOLOv5 • Genetic Algorithms • OpenAI API • SQLAlchemy • Streamlit
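The crop-allocation idea above can be sketched as a small genetic algorithm over a 0/1 knapsack: each gene says whether a crop is planted, and the total area must fit the available land. This is an illustrative sketch only, not FarmTech's implementation; the crop table, capacity, and all GA parameters are made-up assumptions.

```python
import random

# Hypothetical crops: (name, hectares required, expected profit).
CROPS = [("corn", 4, 40), ("soy", 3, 35), ("coffee", 5, 60),
         ("beans", 2, 15), ("sugarcane", 6, 65)]
CAPACITY = 10  # hectares available (assumed)


def fitness(genome):
    """Total profit of the selected crops, or 0 if they exceed capacity."""
    area = sum(crop[1] for crop, gene in zip(CROPS, genome) if gene)
    profit = sum(crop[2] for crop, gene in zip(CROPS, genome) if gene)
    return profit if area <= CAPACITY else 0


def evolve(generations=60, pop_size=30, mutation_rate=0.1, seed=42):
    rng = random.Random(seed)  # seeded so runs are repeatable
    n = len(CROPS)
    population = [[rng.randint(0, 1) for _ in range(n)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: the fitter half survives unchanged.
        population.sort(key=fitness, reverse=True)
        elite = population[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)          # single-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            # Bit-flip mutation keeps diversity in the population.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = elite + children
    return max(population, key=fitness)


best = evolve()
chosen = [crop[0] for crop, gene in zip(CROPS, best) if gene]
print("Selected crops:", chosen, "profit:", fitness(best))
```

A genetic algorithm does not guarantee the optimum; for a table this small, exhaustive search or dynamic programming would be exact, and a GA becomes attractive as the number of crops and constraints grows.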

📊 Elliott Wave ML Financial Analyzer (Scientific Project)

Student Lead & Architect | Automated B3 Stock Analysis & Prediction System

  • Business Goal: To automate the complex detection of Elliott Wave market patterns, creating a professional-grade technical analysis tool for the Brazilian Stock Exchange (B3).
  • Solution & Impact: Led the research and development of a full-stack ML system processing real-time market data. Built a custom feature engineering engine (24 technical indicators) and an MLOps pipeline (MLflow + AWS S3) to train and version Random Forest/SVM models. The system classifies market movements into 4 strategic categories via a Streamlit UI.
  • Technologies: Python • MLflow • AWS S3 • Docker • Scikit-learn • Streamlit • FastAPI • Technical Analysis
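As a rough illustration of what technical-indicator feature engineering involves, the sketch below computes two standard indicators from closing prices. The project's actual 24 indicators are not listed here, so the choice of simple moving average and rate of change, the function names, and the price series are all assumptions.

```python
def sma(prices, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window: i + 1]) / window)
    return out


def rate_of_change(prices, period):
    """Percentage change versus the price `period` bars earlier."""
    out = []
    for i in range(len(prices)):
        if i < period:
            out.append(None)
        else:
            previous = prices[i - period]
            out.append((prices[i] - previous) / previous * 100)
    return out


# Hypothetical closing prices (illustrative values only).
closes = [10.0, 11.0, 12.0, 13.0, 14.0]

# One feature row per bar, ready to feed a classifier once the
# warm-up rows containing None have been dropped.
features = list(zip(sma(closes, 3), rate_of_change(closes, 1)))
print(features)
```

Leaving the warm-up rows as `None` instead of padding them avoids leaking invented values into the training set.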

🧪 Admixture Automation Pipeline (Bioinformatics)

Lead Developer | High-performance bovine ancestry analysis pipeline for Embrapa

  • Business Goal: To solve computational bottlenecks in genomic ancestry analysis and democratize access to complex tools for researchers.
  • Solution & Impact: Architected a Nextflow automation pipeline that handles data conversion, Quality Control, and visualization. Introduced a parallelized Cross-Validation engine (reducing scan times drastically) and a Streamlit Web UI, allowing non-coders to run scientific-grade population structure analyses.
  • Technologies: Nextflow • Python • Streamlit • R • Bioinformatics • Parallel Computing • Docker




🌐 COMMUNITY PROJECTS & LEADERSHIP

Leading diverse teams to deliver production-ready AI platforms

As a Project Leader in the international SuperDataScience community, I led diverse teams of data scientists and ML engineers to deliver production-ready AI/ML platforms. I was responsible for aligning project priorities with stakeholders, defining KPIs, and managing deployment.

Leadership Experience: Project Lead for 2 projects | Project Member for 2 projects

GlucoTrack: Diabetes Risk Prediction Platform

Project Lead | Comprehensive diabetes risk assessment system using the CDC diabetes dataset

Led a diverse team of data scientists and ML engineers to deliver both beginner-friendly and advanced deep learning solutions.

Key Features: Built traditional ML models (Logistic Regression, Decision Trees) and advanced Feedforward Neural Networks with hyperparameter tuning. Includes model explainability tools and multiple deployment options.

Technologies: Python • Scikit-learn • Deep Learning • Streamlit • Model Explainability • Healthcare AI • Data Science

Live app: glucotrack.streamlit.app


MLPayGrade: ML Salary Prediction System

Project Lead | End-to-end salary prediction platform analyzing the 2024 machine learning job market

Coordinated a team of data scientists and ML engineers to build comprehensive solutions across multiple skill levels.

Key Features: Analyzes global salary trends and job feature impacts on compensation. Features both traditional ML pipelines and advanced deep learning on tabular data with embeddings and explainability.

Technologies: Python • Scikit-learn • Deep Learning • Tabular Data • Streamlit • Job Market Analytics • Data Science


EduSpend: Global Education Cost Prediction

Project Member | End-to-end machine learning platform to predict Total Cost of Attendance for international higher education

Key Features: Achieved a 96.44% R² score with an XGBoost Regressor, deployed via both a Streamlit web app and a FastAPI service, all containerized with Docker and automated with CI/CD.

Technologies: Scikit-learn • XGBoost • MLflow • Streamlit • FastAPI • Docker • CI/CD • Data Science
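For context on the R² figure above: the coefficient of determination compares a model's squared error with that of always predicting the mean, so 1.0 is a perfect fit and 0.0 is no better than the mean. The sketch below is a plain-Python version of that definition with made-up cost values; it is not taken from the EduSpend codebase.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all targets are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical annual costs in thousands of USD (illustrative values).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [15.0, 20.0, 30.0, 35.0]

print(f"R²: {r2_score(actual, predicted):.4f}")
```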


FinResearch AI: Multi-Agent Market Intelligence

Project Lead | Agentic AI system for automated institutional-grade financial research

Led the development of an autonomous multi-agent system that mimics a professional financial analyst team.

Key Features: Orchestrates 5 specialized agents (including Researcher, Quant Analyst, and Reporter) using Shared Vector Memory to scrape real-time news, calculate financial ratios, and synthesize findings into investment-grade reports. Implements "Advanced Track" architecture using CrewAI concepts.

Technologies: Python • CrewAI • OpenAI Agents • RAG • ChromaDB • Streamlit • Financial APIs


Smart Leaf: Deep Learning for Crop Disease

Project Member | Deep learning solution that classifies 14 different crop diseases across four species

Key Features: A Convolutional Neural Network (CNN) trained on my local machine on over 13,000 images, using only modularized Python scripts (no notebooks), and deployed via a user-friendly Streamlit interface for real-time predictions. Covers corn, potato, rice, and wheat diseases.

Technologies: Deep Learning • Computer Vision • CNN • TensorFlow • PyTorch • Streamlit • Locally Trained Neural Network





🔍 EXPLORE MORE PROJECTS


... and even more projects in my repositories, covering Data Science, Machine Learning, MLOps, LLMOps, IoT, AI engineering, bioinformatics, and more!

View All Repositories




🛠️ TECH STACK & TOOLS

My technical arsenal for building scalable solutions

AI & Machine Learning

Agentic AI & LLMs

Architecture, Backend & APIs

Databases & Data Engineering

Cloud & MLOps

Frontend & Visualization

Testing & Code Quality

IoT & Edge AI





🌍 GLOBAL COMMUNICATION

Languages & International Collaboration



Pinned

  1. hermes_reply_award_winning_system Public

    An award-winning, now open-source platform for predictive industry maintenance. This project is built on a resilient and scalable stack including FastAPI, PostgreSQL/TimescaleDB, and Redis, all wi…

    Jupyter Notebook 1

  2. brim_agentic_invoice_system Public

    Technical test for Brim's AI Engineer role: implementation of a multi-agent system for invoice automation. Delivered on 02/02/2025.

    Python 1

  3. AgenticFlow Public

    Before Google integrated Gemini into Google Workspace & Gmail, I built a multi-agent AI system using CrewAI, Flask, & React. Summarizes Gmail emails, generates replies with human review, extracts A…

    Python 4 1

  4. anti-aging-epigenetics-ml-app Public

    A thesis MVP for a personalized anti-aging system that analyzes genetic SNPs and lifestyle habits using ML models (Random Forest and Neural Networks) to provide risk assessments and actionable reco…

    Jupyter Notebook 1

  5. SDS-CP035-gluco-track Public

    Forked from SuperDataScience-Community-Projects/SDS-CP035-gluco-track

    GlucoTrack is a machine learning and deep learning project focused on predicting a person’s risk level of diabetes

    Jupyter Notebook 1

  6. FarmTech_System Public

    System developed as the final project of FIAP's AI Engineering program, combining six main modules to optimize agricultural production through data and artificial intelligence.

    Jupyter Notebook