Supply Chain Data Competition

SCAIL is pleased to announce its first Data-driven Supply Chain Competition! The competition runs during INCOM 2024, in collaboration with the Centre for Autonomous and Intelligent Systems (CAIS) at the University of Huddersfield, and Supply Chain and Operations Management at The Berlin School of Economics and Law. The competition comprises two segments, described below. Contestants can take part in either segment or in both, as individuals or as part of a team. Further information can be found at: https://www.incom2024.org/data-challenge/


Segment 1: Logistics Competition

 

The final dataset for the logistics competition has been released! Please download the dataset from: https://drive.google.com/drive/folders/1bqRzwpE2krHl9nswa2CrSD5CHqaLTRGx?usp=sharing

 

 

Dataset Description:

The dataset for this segment consists of 100 instances of the Multi-Depot Capacitated Vehicle Routing Problem (MDCVRP), which are randomly generated within a 1000 x 1000 Euclidean space, with configurations based on carefully selected parameters.

 

Properties for each instance include:

  • name: The unique identifier for the instance
  • depots: The number of depots in this instance
  • dimensions: The size of this problem, representing the sum of the number of depots and the number of customers, which is indicative of the computational complexity of the instance.
  • edge_weight_type: The distance type between two nodes, which is 2D Euclidean distance in this competition.
  • capacity: The capacity of the vehicles allowed in this instance.
  • node_coord_section: All nodes (depots and customers) in this instance. The depots are listed first, followed by the customers. Each node includes three properties:
    • id: The identifier of the node
    • x: The x-coordinate of the node
    • y: The y-coordinate of the node
  • demand_section: The demand of each node. Depot nodes have a demand of zero.
  • depot_section: The list of depot nodes, which is the same as the first part of “node_coord_section”.
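
The property list above can be mirrored in a small in-memory structure. The following is a minimal sketch only: the class names `Node` and `Instance`, the helper `check_instance`, and the choice of Python types (for example, integer demands keyed by node id) are assumptions made for illustration and are not part of the official instance files. The field names themselves are taken from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: int
    x: float
    y: float


@dataclass
class Instance:
    # Field names follow the property list; the types are assumptions.
    name: str
    depots: int
    dimensions: int
    edge_weight_type: str
    capacity: int
    node_coord_section: list[Node] = field(default_factory=list)
    demand_section: dict[int, int] = field(default_factory=dict)
    depot_section: list[int] = field(default_factory=list)


def check_instance(inst: Instance) -> list[str]:
    """Return a list of violated consistency rules (empty if none)."""
    problems = []
    # dimensions is the number of depots plus the number of customers.
    if len(inst.node_coord_section) != inst.dimensions:
        problems.append("dimensions should equal depots + customers")
    # Depots are listed first and repeated in depot_section.
    leading_ids = [n.id for n in inst.node_coord_section[: inst.depots]]
    if leading_ids != inst.depot_section:
        problems.append("depot_section should match the first nodes listed")
    # Depot nodes carry no demand.
    if any(inst.demand_section.get(d, 0) != 0 for d in inst.depot_section):
        problems.append("depot nodes should have zero demand")
    return problems
```

Running a check like this after loading an instance can catch parsing mistakes before any solver time is spent.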

 

Instance files:

For ease of access, each instance is available in two formats: plaintext (.vrp) and structured text-based file (.yaml). Participants can choose either format for loading the instance, as both contain the same information. Each instance is also plotted, with depots represented as squares and customers as solid circles. These visualisations are saved as .png files, providing a clear illustration of the instance node distribution.

 

Sample Instances:

We generated a collection of 20 MDCVRP instances using the same instance generator as for the final competition instances, but with different randomisation seeds. Participants can use these sample instances for testing their solvers.

 

Vehicles:

The vehicles in each instance are homogeneous, all having the same capacity as specified by the “capacity” property in the instance file. Each vehicle is assigned to a single depot, from which it departs and to which it returns after completing its journey.

There is no limit on how many vehicles a solution can use, but the number of vehicles affects the objective, which is to minimise the total travel distance of all vehicles.

 

Solutions and Evaluations:

Participants should develop their solvers using their preferred programming languages and open-source libraries to provide solutions for the 100 instances. The performance of the solver will be scored on:

  • The total number of instances it can solve,
  • The quality of the solutions (the shorter the total routes, the better),
  • The computational efficiency, and
  • The vehicle space utilisation (the higher, the better).

 

A valid solution should include the following information:

  • Routes of all vehicles, including the load and distance of each route
  • Total distance and total load of all routes
  • Vehicle capacity
  • Machine specification (CPU and memory size) and the computation time (all solutions will be evaluated on the same machine)
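
The per-route and total figures listed above could be computed along the following lines. This is a minimal sketch under stated assumptions: the function names, the `(depot, customers)` route representation and the dictionary layout are hypothetical, and the utilisation formula (total load divided by the combined capacity of the vehicles used) is one plausible reading, since the competition text does not define it. Edge lengths are rounded down to integers, following the distance rule stated in this segment.

```python
import math


def edge_length(a, b):
    """Euclidean distance between two (x, y) points, rounded down."""
    return math.floor(math.hypot(a[0] - b[0], a[1] - b[1]))


def summarise_route(depot, customers, coords, demand, capacity):
    """Load and distance of one vehicle leaving and returning to `depot`."""
    load = sum(demand[c] for c in customers)
    if load > capacity:
        raise ValueError(f"route from depot {depot} exceeds capacity")
    path = [depot, *customers, depot]
    distance = sum(
        edge_length(coords[u], coords[v]) for u, v in zip(path, path[1:])
    )
    return {"depot": depot, "customers": list(customers),
            "load": load, "distance": distance}


def summarise_solution(routes, coords, demand, capacity):
    """Aggregate totals over all routes; `routes` is a list of
    (depot_id, [customer_ids]) pairs (a hypothetical layout)."""
    per_route = [
        summarise_route(d, cs, coords, demand, capacity) for d, cs in routes
    ]
    total_load = sum(r["load"] for r in per_route)
    total_distance = sum(r["distance"] for r in per_route)
    # Assumed definition of utilisation: load carried / capacity deployed.
    used = capacity * len(per_route)
    utilisation = total_load / used if used else 0.0
    return {"routes": per_route, "total_load": total_load,
            "total_distance": total_distance, "capacity": capacity,
            "utilisation": utilisation}
```

For example, with a depot at (0, 0) and customers at (3, 4) and (3, 0), a single route visiting both in that order has edge lengths 5, 4 and 3, giving a distance of 12.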

 

Please ensure your solution files are well formatted, with plenty of comments, and saved as individual plaintext files. Each file should be named either “[instance_id].sol” or “sol__[instance_id].txt”, where [instance_id] is the actual identifier of the instance.

 

An example solution (which may not be an optimal solution), along with its corresponding instance, is provided for reference.

 

The Euclidean distance between two nodes (x1, y1) and (x2, y2) is a floating-point number. For simplicity, only its integer part is considered in the competition, so please round the value down to the nearest integer.
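
As an illustration of this rounding rule, a minimal Python sketch follows. The function name `rounded_distance` is hypothetical; the calculation is simply the standard Euclidean formula followed by rounding down.

```python
import math


def rounded_distance(x1, y1, x2, y2):
    """Euclidean distance between (x1, y1) and (x2, y2), rounded down."""
    return math.floor(math.hypot(x2 - x1, y2 - y1))


# The exact distance from (0, 0) to (10, 10) is about 14.14,
# so the value used in the competition would be 14.
print(rounded_distance(0, 0, 10, 10))
```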

 

Solution Submission:

Please package all your solutions into a .zip file and send it to: scail.incom2024@gmail.com by the deadline.

 

The 20 sample instances and the example solution can be downloaded from: https://drive.google.com/drive/folders/1c7yC_DdyLxmwF5i5E6WAO8J9zVRPkHQT?usp=sharing

 

 

 

Segment 2: Supply Chain Delay Prediction

 

The final dataset for the supply chain competition has been released! Please download the dataset from: https://drive.google.com/drive/folders/10Ft8CC7z1UToc_V8tnhQaaibFPX2WL_B?usp=sharing

 

 

Dataset Description:

The dataset for this segment is synthetically generated from a real dataset. The original dataset was first preprocessed, which involved cleansing, renaming variables, and removing variables that directly indicate delivery outcomes. The preprocessed data was then used to train our data generation model, from which the competition data was sampled. The resulting dataset consists of around 150k samples, each containing 40 feature variables and 1 outcome variable.

 

We also generated an example dataset that contains 15k samples using the same data generation model. This example dataset and corresponding variable descriptions can be downloaded via the link: https://drive.google.com/drive/folders/1pupvDGYWw0J9ixRqlJLFTJzm6v8gmgwa?usp=sharing

 

Participants can develop and pre-test their models using this example dataset. The final dataset will be released one day before the conference.

 

Note that the problem has been designed as a three-label classification problem. Your goal is to create a machine-learning-based classification model that predicts the delivery outcome: whether an order arrives early, on time, or with a delay.

 

Solution Submission and Evaluation:

The final dataset will be split into two parts: 80% is released for model training (the model training dataset) and 20% is kept for final model scoring (the model scoring dataset). The f1_score with weighted average will be used as the performance metric for scoring the submitted models. An example code_snippet in Python is provided for reference.
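
To make the metric concrete, the following is a minimal pure-Python sketch of a weighted-average F1 score: the F1 of each class is computed separately and then averaged with weights proportional to the number of true samples in that class. It is not the organisers' code snippet. The function name `weighted_f1` and the label strings in the usage example are assumptions for illustration. To the best of our understanding, this matches what scikit-learn's `f1_score` reports with `average="weighted"`, but official scores should be checked against the organisers' snippet.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    support = Counter(y_true)      # true samples per class
    predicted = Counter(y_pred)    # predicted samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = correct[label]
        precision = tp / predicted[label] if predicted[label] else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total   # weight by class support
    return score


# Hypothetical labels; the real encoding is given in the variable descriptions.
y_true = ["early", "on_time", "late", "late"]
y_pred = ["early", "late", "late", "late"]
print(weighted_f1(y_true, y_pred))
```

In this example the per-class F1 scores are 1.0 (early), 0.0 (on_time) and 0.8 (late), with weights 1/4, 1/4 and 2/4, so the weighted score is approximately 0.65.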

 

Solutions can either be submitted to us for verification and testing, or you can download the scoring data, test on your own machine, and submit your f1_score on the model scoring dataset. In the latter case, we will need to check your code to validate your f1_score.

 

The team with the highest f1_score on the model scoring dataset will be the final winner.

 

All submissions should be sent to scail.incom2024@gmail.com. For any questions, please feel free to contact Dr Liming Xu (lx249@cam.ac.uk).

 

 

 

 

 

For further information please contact:

Prof Alexandra Brintrup

T: +44 (0)1223 764615

E: ab702@cam.ac.uk
