Data Engineer · São Paulo, Brazil

Daniel
Alves de
Camargo

Building scalable data pipelines, AI-ready data layers, and intelligent integrations. Currently at Skyone, turning raw data into Gold.

Daniel Camargo

Data Engineer focused on AI-ready infrastructure.

Junior Data Engineer at Skyone Solutions. I specialize in building ETL pipelines and Data LakeHouse architectures using Python, PySpark, SQL, Databricks, and Azure. Beyond traditional data engineering, I integrate AI agents and LLM APIs into data workflows to drive automation and intelligence. With a strong background in Linux server management and infrastructure, I build efficient, scalable data layers that bridge the gap between raw data and production-ready AI models.

Currently studying for the Databricks Data Engineer Associate certification and pursuing a degree in Systems Analysis & Development at FATEC-SP (2023–2027).

3+
Years coding
2
Promotions at Skyone
40%
Report time saved via automation
Gold
Lake layer for AI agents

Mar 2026 — Present

Skyone Solutions

Full-time

Junior Data Engineer

  • Develop and maintain scalable data pipelines targeting Gold-layer consumption by AI models and agents
  • Implement contextual search skills and direct LLM integration with the Data Warehouse via MCP (Model Context Protocol)
  • Build REST APIs with FastAPI and Python to serve AI models and automation solutions
  • Automate data flow integrations using Python, SQL, DuckDB, and JSONata
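The Gold-layer transforms described above boil down to SQL over raw ingested rows. A minimal sketch of the idea (table and column names are invented for illustration, and Python's stdlib sqlite3 stands in for DuckDB so the example is self-contained):

```python
import sqlite3

# Hypothetical Bronze-layer rows: raw order events as they land from a source API.
bronze_rows = [
    ("2026-03-01", "acme", 120.0),
    ("2026-03-01", "acme", 80.0),
    ("2026-03-02", "globex", 50.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bronze_orders (order_date TEXT, customer TEXT, amount REAL)")
conn.executemany("INSERT INTO bronze_orders VALUES (?, ?, ?)", bronze_rows)

# Gold layer: one clean, aggregated table an AI agent can query directly.
conn.execute("""
    CREATE TABLE gold_daily_revenue AS
    SELECT order_date, customer, SUM(amount) AS revenue
    FROM bronze_orders
    GROUP BY order_date, customer
""")

for row in conn.execute("SELECT * FROM gold_daily_revenue ORDER BY order_date, customer"):
    print(row)
```

The same pattern scales up: swap the in-memory database for DuckDB or a warehouse connection, and the aggregation query becomes the pipeline's transform step.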

Apr 2025 — Feb 2026

Skyone Solutions

Internship → Promoted

Data Engineering Intern

  • Built data ingestion pipelines into a Data LakeHouse with Parquet validation and export
  • Modeled and manipulated data in DuckDB, JSON, Parquet, and CSV formats for AI Agent Gold layers
  • Created automated flows in Skyone Studio using REST/SOAP APIs
  • Automated data transformations with JSONata; gathered business requirements and rules
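The JSONata-style transformation work above amounts to flattening nested API JSON into flat Gold-layer records. A plain-Python sketch of the same mapping (the payload shape and field names are hypothetical):

```python
import json

# Hypothetical nested payload, as a REST API might return it.
payload = json.loads("""
{
  "customer": {"id": 42, "name": "Acme"},
  "orders": [
    {"sku": "A-1", "qty": 2, "unit_price": 10.0},
    {"sku": "B-7", "qty": 1, "unit_price": 99.9}
  ]
}
""")

# Roughly what a JSONata mapping expression would express declaratively:
# project each order into a flat record, carrying down the parent customer id.
flat = [
    {
        "customer_id": payload["customer"]["id"],
        "sku": order["sku"],
        "total": round(order["qty"] * order["unit_price"], 2),
    }
    for order in payload["orders"]
]
print(flat)
```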

Jan 2023 — Jun 2024

Amilplast Variedades

Part-time

E-commerce Assistant

  • Automated financial reports in Excel VBA, reducing preparation time by 40%
  • Created optimization scripts and macros, increasing team productivity by 30%

Tech stack

Python
Pandas · NumPy · FastAPI · Selenium
SQL
PostgreSQL · DuckDB
Data Lake / ETL
Parquet · JSON · CSV · LakeHouse
Databricks
LakeFlow · Data Ingestion
APIs
REST · SOAP · FastAPI
Azure
Cloud data infrastructure
JSONata
Data transformation
AI / MCP
LLM integration · Agents
Git · Bash
Version control · Shell scripting
English
Advanced / Fluent

Selected work

$ psql -d crypto_db
SELECT coin_id, price_usd, market_cap
FROM market_data
WHERE recorded_at > NOW() - INTERVAL '1h'
ORDER BY market_cap DESC
LIMIT 50;

  coin_id  | price_usd | market_cap
-----------+-----------+-----------
 bitcoin   |  69420.00 | 1.36T
 ethereum  |   3800.50 | 456B
 solana    |    180.22 | 78B

CoinGecko → PostgreSQL Pipeline

ETL pipeline ingesting live cryptocurrency market data from the CoinGecko API into a PostgreSQL database. Automated scheduling, data validation, and structured storage for analysis.
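The fetch → validate → load cycle of this pipeline can be sketched in a few lines. Field names mirror the shape of a CoinGecko markets response, but the records here are inlined sample data (no network call), and stdlib sqlite3 stands in for PostgreSQL so the sketch stays self-contained:

```python
import sqlite3

# Sample records shaped like a CoinGecko markets response (abridged);
# in the real pipeline these would come from an HTTP request.
raw = [
    {"id": "bitcoin", "current_price": 69420.0, "market_cap": 1_360_000_000_000},
    {"id": "ethereum", "current_price": 3800.5, "market_cap": 456_000_000_000},
    {"id": "solana", "current_price": None, "market_cap": 78_000_000_000},  # bad row
]

def is_valid(rec):
    """Validation step: drop rows with missing or non-positive prices."""
    price = rec.get("current_price")
    return isinstance(price, (int, float)) and price > 0

conn = sqlite3.connect(":memory:")  # stands in for the PostgreSQL target
conn.execute("CREATE TABLE market_data (coin_id TEXT, price_usd REAL, market_cap REAL)")
conn.executemany(
    "INSERT INTO market_data VALUES (:id, :current_price, :market_cap)",
    [r for r in raw if is_valid(r)],
)
print(conn.execute("SELECT COUNT(*) FROM market_data").fetchone()[0])  # rows loaded
```

Scheduling the real version is a matter of wrapping this in a cron job or orchestrator task and pointing the insert at the live database.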

SmartLift.AI

Mobile app combining real-time video analysis, AI-powered workout coaching, and social features. A personal side project exploring applied AI in fitness.

Low Code Podcast

Guest on the Low Code podcast by Skyone. We talked about data engineering, low-code platforms, and where AI fits into all of it.