Data-driven framework for Sequential Decision Making in Operations Research

About this project

Project description

This project addresses a class of optimization problems, referred to as parameterized Sequential Decision Making (para-SDM) problems, and will develop a comprehensive, flexible, data-driven framework for solving them. Para-SDM problems cover a wide range of network logistics and planning problems in operations research (OR); for instance, they model facility location with path optimization (FLPO), vehicle routing, sensor network design, manufacturing process parameter optimization, last-mile delivery, and industrial robot-resource allocation problems.

Para-SDM problems include large subclasses of problems such as Markov Decision Processes (MDPs), reinforcement learning (RL), clustering, resource allocation, scheduling, and routing. Conceptually, these problems require simultaneously (a) determining the shortest path and (b) allocating resources in a network, while incorporating application-specific capacity and exclusion constraints and respecting the dynamical evolution of the network.
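As a rough illustration of requirements (a) and (b) above, the sketch below solves a toy facility-location-with-path-optimization instance by brute force. It is not the project's method: the graph, the node names, the single-facility setting, and the helper names `dijkstra` and `best_facility` are all assumptions made for this example, and enumeration like this only works on very small instances.

```python
import heapq

def dijkstra(graph, source):
    """Shortest-path distances from source on a weighted undirected graph."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def best_facility(graph, sources, destination, candidates):
    """Pick the facility node (the 'parameter') that minimizes the total
    cost of routing every source through the facility to the destination."""
    best_node, best_cost = None, float("inf")
    for f in candidates:
        dist_from_f = dijkstra(graph, f)  # graph is undirected, so d(f, s) = d(s, f)
        cost = sum(dist_from_f[s] + dist_from_f[destination] for s in sources)
        if cost < best_cost:
            best_node, best_cost = f, cost
    return best_node, best_cost

# Hypothetical toy network: two sources, two candidate facility sites, one destination.
graph = {
    "s1": {"a": 1, "b": 4},
    "s2": {"a": 2, "b": 1},
    "a": {"s1": 1, "s2": 2, "t": 5},
    "b": {"s1": 4, "s2": 1, "t": 1},
    "t": {"a": 5, "b": 1},
}
facility, total_cost = best_facility(graph, ["s1", "s2"], "t", ["a", "b"])
print(facility, total_cost)
```

The outer loop over candidate sites is the allocation decision and the inner Dijkstra calls are the routing decision; the difficulty in realistic para-SDM problems is that the two cannot be enumerated separately like this once there are many facilities, constraints, or stochastic dynamics.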

Para-SDMs can also be viewed as a generalization of MDPs in the following sense: just as shortest-path or routing problems (with inherent system dynamics) lie at the heart of MDPs, the central task in para-SDM is to solve shortest-path problems simultaneously with facility allocation (the parameters). As with MDPs, the basic para-SDM formulation has many variants. These include constraints on routes and resources, such as capacity constraints on nodes or on the links between nodes in a route, and communication or topological (inclusion/exclusion) constraints on the nodes; dynamics on the nodes, modelled as either deterministic or stochastic; and uncertainty in the model parameters.
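The link between MDPs and shortest paths mentioned above can be made concrete with a small sketch. Assuming a deterministic MDP in which each action moves to a neighbouring state at a fixed cost, value iteration recovers the shortest-path cost to the goal from every state. The state names, costs, and the function `value_iteration` are illustrative assumptions, not part of the project description.

```python
def value_iteration(costs, goal, sweeps=100):
    """Value iteration for a deterministic, cost-minimizing MDP.

    costs[s][n] is the cost of the action that moves from state s to state n.
    Returns the optimal cost-to-go V and a greedy policy (next state per state).
    """
    V = {s: float("inf") for s in costs}
    V[goal] = 0.0
    for _ in range(sweeps):
        changed = False
        for s in costs:
            if s == goal:
                continue
            # Bellman update: best one-step cost plus cost-to-go of the successor.
            best = min((c + V[n] for n, c in costs[s].items()), default=float("inf"))
            if best < V[s]:
                V[s] = best
                changed = True
        if not changed:
            break  # values have converged
    policy = {
        s: min(costs[s], key=lambda n: costs[s][n] + V[n])
        for s in costs
        if s != goal and costs[s]
    }
    return V, policy

# Hypothetical four-state example with goal state "g".
costs = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 1, "g": 5},
    "b": {"g": 1},
    "g": {},
}
V, policy = value_iteration(costs, "g")
print(V, policy)
```

In a para-SDM setting the costs themselves would depend on parameters such as facility locations, so the parameters and the policy would have to be optimized together rather than with the costs held fixed as they are here.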

This project will involve:

(1) Developing a flexible framework that addresses the class of optimization problems that fall under the ambit of para-SDM tasks.
(2) Abstracting a learning paradigm, parameterized reinforcement learning (para-RL), from the above framework that learns solutions to OR problems from data.
(3) Demonstrating its capabilities on OR problems, and benchmarking against existing methods.


Outcomes and Deliverables

(1) An efficient, entropy-based, data-driven framework for para-SDM problems.
(2) Improvement over current state-of-the-art data-driven solutions to OR problems.
(3) High-quality research papers in leading AI/ML/Operations Research journals and conferences.
(4) Contributions to open-source algorithms that solve popular OR problems such as job shop scheduling and planning in collaborative robotics.
(5) Benchmarking against state-of-the-art model-free and model-based solutions to OR problems.

Information for applicants

Essential capabilities

Calculus, Linear Algebra, Probability and Statistics

Desirable capabilities

Mathematical optimization, Machine Learning, Programming, Reinforcement Learning

Expected qualifications (Course/Degrees etc.)

BTech (or equivalent/higher degree) in relevant engineering stream OR MSc in Maths/Operations Research/Statistics or other relevant streams

Project supervisors

Principal supervisors

UQ Supervisor

Dr Nan Ye

School of Mathematics and Physics

IITD Supervisor

Assistant Professor Amber Srivastava

Department of Mechanical Engineering

Additional Supervisor

Assistant Professor Prashant Palkar

Department of Mechanical Engineering