Wednesday, 24 March 2010
A little while ago, I delivered my master thesis and finished my Master of Technology in Computer Science degree at the Norwegian University of Science and Technology. The thesis has the title iAD: Query Optimization in MARS, and is about building a query optimizer for Fast's (now a subsidiary of Microsoft, they bought them) new Enterprise Search solution.

The report itself can be found here: MasterReport.pdf, and the abstract is included below:

This document is the report for the authors' joint effort in researching and designing a query optimizer for Fast's next-generation search platform, known as MARS. The work was done during our master thesis at the Department of Computer and Information Science at the Norwegian University of Science and Technology, spring 2009.

MARS does not currently employ any form of query optimizer, but does have a parser and a runtime system. The report therefore focuses on the core query optimizing aspects, like plan generation and optimizer design. First, we give an introduction to query optimizers and selected problems. Then, we describe previous and ongoing efforts regarding query optimizers, before shifting focus to our own design and results.

MARS supports DAG-structured query plans for more efficient execution, which means that the optimizer must do so too. This turned out to be a greater task than what it might seem like -- since we must use algorithms that greatly differ from the optimizers we were familiar with. The optimizer also needed to be extensible, including the ability to deal with future query operators, as well as supporting arbitrary cost models.

During the course of the master's thesis, we have laid out the design of an optimizer that satisfies these goals. The optimizer is able to recognize common subexpressions and construct DAGs from non-DAG inputs. Extensibility is solved by loose coupling between optimizer components. Rules are used to model operators, and the cost model is a separate, customizable component. We have also implemented a prototype that demonstrates that the design actually works.

The optimizer itself is designed as a separate component, not tied up to MARS. We have been able tin inject it into the MARS query pipeline and run queries end-to-end with optimization enabled, improving query evaluation time. For now, the project depends on MARS assemblies, but reusing it for another engine and algebra is entirely feasible.

posted on Wednesday, 24 March 2010 04:07:19 (W. Europe Standard Time, UTC+01:00)  #    Comments [2]