Curriculum Vitae
Education
New York University (NYU), Ph.D., Computer Science (GPA: 3.9/4.0)
Jan 2020 – May 2024
- Thesis: Efficient Algorithms for Correlated Data Discovery
- Awards: Dissertation Award and Qualification Exam Award (2×)
- Advisor: Juliana Freire
- Coursework: Deep Learning, Information Visualization, Application Security, Computational Geometry
Federal University of Minas Gerais (UFMG), M.Sc., Computer Science
Mar 2011 – Mar 2013
- Thesis: Learning to Schedule Web Page Updates Using Genetic Programming
- Advisor: Nivio Ziviani
- Coursework: Information Retrieval, Machine Learning, Data Mining, Web Data Management
Federal Institute of Piauí (IFPI), Analysis and Development of Systems (CGPA: 9.1/10)
Jan 2008 – Jan 2011
- Capstone: A distributed system for indexing and retrieval of text documents (full-text search) and images (content-based retrieval)
- Advisor: Valéria Costa
Professional Experience
New York University (NYU), Research Engineer
Mar 2015 – Present
- Conducted research and engineering for various multi-year R&D programs funded by DARPA (Memex, D3M, and ASKEM), ARPA-H (BDF), and NSF, in research areas such as IR, ML, data management, and interactive systems
- Recruited multiple MS student interns and mentored their projects; collaborated with and mentored multiple research engineers and Ph.D. students across several research projects
- Ideated and co-authored research grant proposals that attracted over a million dollars in research funding
- Designed and developed multiple open-source systems:
  - Visus: An interactive AutoML system enabling subject-matter experts to construct ML models without coding
  - Auctus: A dataset search engine focused on retrieving datasets for improving ML models via data augmentation
  - bdi-kit: A Python library using language models to support data integration tasks
  - Harmonia: An agent for automating data integration tasks by combining LLM-based reasoning, user interaction, and the bdi-kit library
  - ACHE: A focused web crawler for domain-specific search
- Collaborated with partners in academia, government, and industry to deploy research systems in production
Zunnit Technologies, Software Engineer
Mar 2013 – Feb 2015
- Contributed to a full rewrite of the company's recommendation platform (REST APIs, admin dashboard, and recommendation algorithms), increasing its scalability to handle millions of recommendation requests
- Ensured high availability of the platform, handling operational aspects including deployment automation, monitoring, software releases, and debugging
- Researched new algorithms for news recommendation based on semantic topic modeling, learning-to-rank models, and multi-armed bandit algorithms
Laboratory for Treating Information (UFMG), Research Assistant
Mar 2011 – Feb 2013
- Developed a new learning-based algorithm for scheduling web page visits for large-scale web crawlers (2 papers published at SPIRE and the Information Retrieval Journal)
- Contributed to the implementation and refactoring of InWeb Crawler, a C++ web crawler capable of collecting and processing millions of web pages per hour
Research Laboratory in Information Systems (IFPI), Research Assistant
Sep 2009 – Jan 2011
- Refactored code and improved Opala, a library for text and image indexing and retrieval based on Apache Lucene
- Developed a web-based digital library system (frontend and backend) for cataloging and searching documents
- Designed and implemented Jazida, a scalable distributed system for content indexing based on Opala/Lucene
Netsoft Tecnologia em Sistemas Ltda., Web Developer Intern
Aug 2009 – Nov 2010
- Developed, as a full-stack engineer, a modern web-based point-of-sale and inventory management system to replace the company's legacy system
Awards
2024
Pearl Brownstein Doctoral Research Award.
For "Ph.D. students whose doctoral research shows the greatest promise"
2023
SIGMOD DEEM 2023 Best Paper Award.
Paper: "Using Pipeline Performance Prediction to Accelerate AutoML Systems"
2022
Deborah Rosenthal, MD Award.
For "outstanding performance on the Ph.D. qualifying examination"
2021
Google Research Collabs Fellowship.
Grant for funded research in collaboration with a research scientist at Google Research