Anas Dorbani

PhD Student in Computer Engineering

I am a PhD student at Polytechnique Montreal specializing in the intersection of AI and data systems. My research focuses on multimodal data integration, tabular understanding, and enhancing database systems with large language models. I am passionate about building the next generation of intelligent data systems.

Data & AI Systems, Multimodal Data Integration, Tabular Understanding
Research

Published Papers

Beyond Quacking: Deep Integration of Language Models and RAG into DuckDB

Anas Dorbani, Sunny Yasser, Jimmy Lin, Amine Mhedhbi

Very Large Data Base Endowment (VLDB) - Demonstration Track, 2025
Experience

Work Experience

Oracle Labs

Casablanca, Morocco

February 2024 - July 2024

Research Assistant

Data Integration Team

Automated schema generation for Oracle's Financial Crimes & Compliance systems to improve data processing. Fine-tuned 7B-parameter models to optimize schemas and handle abbreviated column names. Created a framework to evaluate the accuracy of schema generation and data integration. Improved metadata consistency from 0.4 to 0.6, boosting data interpretability. Optimized parsing of the 7B models' output for better data flow and results.

Oracle Labs

Casablanca, Morocco

June 2023 - August 2023

Research Assistant

AutoMLx Team

Enhanced machine learning explainability for the AutoMLx project by optimizing the LFI/GFI explainers, reducing their processing time by 80% and improving inference speed. Reduced memory usage from 20 GB to 4 GB, lowering operational costs for explanation services. Achieved 83% code coverage to ensure the reliability and maintainability of explainability features. Collaborated with cross-functional teams to deliver scalable, high-performance ML explainability solutions within AutoMLx.

National University of Rabat

Rabat, Morocco

July 2022 - August 2022

Research Assistant

Valuation and Transfer Management

Engineered a deep learning model to predict RFID pricing by scraping specifications and market data. Deployed the solution on GCP using Docker for scalable performance and built a Django web application to streamline data collection and real-time model testing.

Recognition

Awards & Grants

VLDB Travel Grant

Grant
2025

Very Large Data Base Endowment Inc.

Funding support for students, researchers, and faculty to attend the VLDB 2025 conference in London, covering travel, lodging, and registration to promote participation in database research.

Academic Journey

Education

PhD in Computer Engineering

Polytechnique Montreal

Montreal, Canada

2025 - Present

Focus

Researching multimodal data integration and tabular understanding, with a focus on large language models and database systems.

B.Sc. in Computer Science

National School of Computer Science and Systems Analysis

Rabat, Morocco

2019 - 2024

Thesis

Developed an automated approach to schema generation and data processing for financial compliance systems using advanced language models, improving metadata consistency and enhancing overall data integration and interpretability.

Industry Collaboration

Oracle Labs

Portfolio

Projects

FlockMTL

Present

DBMS extension integrating LLMs and RAG into OLAP systems. Developed FlockMTL from infrastructure design through code implementation and optimization. Designed custom map and reduce functions to integrate advanced workflows into relational database systems. Implemented dynamic batching over tuples to improve query execution efficiency.

OpenHands

August 2024

Platform for software development agents. As a core maintainer, I helped reimplement the SWE agent and fix its benchmark to improve performance and reliability. I also assisted with issue resolution and reviewed pull requests to maintain project quality.

SecureStream

Feb 2024

A network security project that employs machine learning and real-time traffic monitoring to detect anomalies in network data. Powered by the CSE-CIC-IDS2018 dataset and cicflowmeter, it enables swift identification of potential threats, enhancing overall network security.