Curriculum Vitae
Education
        
             
New York University (NYU), Ph.D., Computer Science (GPA: 3.9/4.0)
Jan 2020 – May 2024
- Thesis: Efficient Algorithms for Correlated Data Discovery
- Advisor: Juliana Freire
- Coursework: Deep Learning, Information Visualization, Application Security, Computational Geometry
- Awards:
    - ACM SIGMOD Jim Gray Doctoral Dissertation Award – Honorable Mention
    - NYU Pearl Brownstein Doctoral Research Award
    - NYU Deborah Rosenthal, M.D. Outstanding Quals Performance Award
 
        
             
Federal University of Minas Gerais (UFMG), M.Sc., Computer Science
Mar 2011 – Mar 2013
- Thesis: Learning to Schedule Web Page Updates Using Genetic Programming
- Advisor: Nivio Ziviani
- Coursework: Information Retrieval, Machine Learning, Data Mining, Web Data Management
        
             
Federal Institute of Piauí (IFPI), Analysis and Development of Systems (CGPA: 9.1/10)
Jan 2008 – Jan 2011
- Capstone: A distributed system for indexing and retrieval of text documents (full-text search) and images (content-based retrieval)
- Advisor: Valéria Costa
Professional Experience
        
             
New York University (NYU), Research Engineer
Mar 2015 – Present
- Conducted research and engineering for various multi-year R&D programs funded by DARPA (Memex, D3M, and ASKEM), ARPA-H (BDF), and NSF, in research areas such as IR, ML, data management, and interactive systems
- Recruited multiple MS student interns and mentored their projects; collaborated with and mentored multiple research engineers and Ph.D. students across several research projects
- Ideated and co-authored research grant proposals that attracted over a million dollars in research funding
- Designed and developed multiple open-source systems:
    - Visus: An interactive AutoML system enabling subject-matter experts to construct ML models without coding
    - Auctus: A dataset search engine focused on retrieving datasets for improving ML models via data augmentation
    - bdi-kit: A Python library using language models to support data integration tasks
    - Harmonia: An agent for automating data integration tasks by combining LLM-based reasoning, user interaction, and the bdi-kit library
    - ACHE: A focused web crawler for domain-specific search
- Collaborated with partners in academia, government, and industry to deploy research systems in production
        
             
Zunnit Technologies, Software Engineer
Mar 2013 – Feb 2015
- Contributed to a full rewrite of the company's recommendation platform (REST APIs, admin dashboard, and recommendation algorithms), increasing its scalability to handle millions of recommendation requests
- Ensured high availability of the platform, handling operational aspects including deployment automation, monitoring, software releases, and debugging
- Researched new algorithms for news recommendation based on semantic topic modeling, learning-to-rank models, and multi-armed bandit algorithms
        
             
Laboratory for Treating Information (UFMG), Research Assistant
Mar 2011 – Feb 2013
- Developed a new learning-based algorithm for scheduling web page visits for large-scale web crawlers (2 papers published at SPIRE and the Information Retrieval Journal)
- Contributed to the implementation and refactoring of InWeb Crawler, a C++ web crawler capable of collecting and processing millions of web pages per hour
        
             
Research Laboratory in Information Systems (IFPI), Research Assistant
Sep 2009 – Jan 2011
- Refactored code and improved Opala, a library for text and image indexing and retrieval based on Apache Lucene
- Developed a web-based digital library system (frontend and backend) for cataloging and searching documents
- Designed and implemented Jazida, a scalable distributed system for content indexing based on Opala/Lucene
        
             
Netsoft Tecnologia em Sistemas Ltda., Web Developer Intern
Aug 2009 – Nov 2010
- Performed full-stack development of a modern web-based point-of-sale and inventory management system to replace the company's legacy system
Awards
2024
Pearl Brownstein Doctoral Research Award.
For "Ph.D. students whose doctoral research shows the greatest promise"
    
    
2023
SIGMOD DEEM 2023 Best Paper Award.
Paper: "Using Pipeline Performance Prediction to Accelerate AutoML Systems"
    
    
2022
Deborah Rosenthal, MD Award.
For "outstanding performance on the Ph.D. qualifying examination"
        
    
    
2021
Google Research Collabs Fellowship.
Grant for funded research in collaboration with a research scientist at Google Research
    
