Job Summary
We are looking for an engineer to build and run AI services that combine computer vision, CNNs, large language models with RAG, and autonomous agents. You’ll take new ideas from our research group and turn them into fast, reliable microservices on both edge devices and the cloud, expanding what the deviceWISE AI platform can do.
Objectives & Responsibilities
- Design and build GPU-accelerated microservices for vision, LLM, and RAG workloads.
- Own the full model lifecycle: data capture, training, evaluation, packaging, and CI/CD deployment (Docker, Kubernetes, NVIDIA NIM).
- Optimize inference with TensorRT, ONNX Runtime, quantization, batching, and Triton.
- Develop agent orchestration logic (LangChain, CrewAI) that chains tools, prompts, and APIs.
- Build and maintain automated visual inspection systems that use computer vision models for quality control, defect detection, and anomaly identification in manufacturing and IoT environments.
- Integrate high-throughput camera or sensor feeds with cloud knowledge bases for multimodal insights.
- Mentor junior engineers, lead code reviews, and document best practices (senior level).
Requirements & Qualifications
- 3+ years of software development experience with strong proficiency in Python and modern AI/ML frameworks (PyTorch, TensorFlow)
- Deep understanding of machine learning concepts, including:
  - Neural network architectures (CNNs, transformers, RNNs)
  - Model training, validation, and optimization techniques
  - Computer vision and natural language processing fundamentals
- Hands-on experience with ML infrastructure tools:
  - Containerization (Docker, Kubernetes)
  - Model serving platforms (FastAPI, Flask, or equivalent)
  - GPU computing (CUDA, cuDNN) and performance optimization
- Proficiency in building and deploying microservices architectures
- Experience with cloud platforms (AWS, GCP, or Azure) and MLOps practices
- Strong foundation in software engineering principles, version control (Git), and collaborative development
Preferred Qualifications
- Experience with edge AI deployment and optimization for resource-constrained devices
- Familiarity with IoT protocols and time-series data processing
- Contributions to open-source AI/ML projects or research publications
- Experience with vector databases and embedding systems for RAG applications
Location
Remote or on-site in Boca Raton, Florida, or São Paulo, Brazil