Automated Sequence Planning for Complex Robotic Assembly with Physical Feasibility

ICRA 2024

¹MIT CSAIL, ²Autodesk Research, ³University of Waterloo, ⁴Texas A&M University
Work partially done while interning at Autodesk Research

ASAP efficiently generates physically feasible assembly plans for complex and general-shaped assemblies that are executable by robots.


The automated assembly of complex products requires a system that can automatically plan a physically feasible sequence of actions for assembling many parts together. In this paper, we present ASAP, a physics-based planning approach for automatically generating such a sequence for general-shaped assemblies. ASAP accounts for gravity to design a sequence where each sub-assembly is physically stable with a limited number of parts being held and a support surface. We apply efficient tree search algorithms to reduce the combinatorial complexity of determining such an assembly sequence. The search can be guided by either geometric heuristics or graph neural networks trained on data with simulation labels. Finally, we show the superior performance of ASAP at generating physically realistic assembly sequence plans on a large dataset of hundreds of complex product assemblies. We further demonstrate the applicability of ASAP on both simulation and real-world robotic setups.


Assembly Sequence Planning

Disassembly Tree Search

We apply assembly-by-disassembly: the assembly sequence is obtained by reversing a disassembly sequence, which is much less complex to search for.

We formulate disassembly sequence planning as a tree search, so that established search techniques can be applied to find feasible disassembly sequences within a constrained evaluation budget.

A tree expansion is feasible only if it satisfies the following constraints:

  • Path constraint: A penetration-free path must exist to disassemble the part.
  • Stability constraint: The remaining subassembly must be stable after the part is disassembled. Otherwise, the unstable parts must be held by grippers or rest on a supporting surface.
  • Execution constraint: The assembly process must be executable given the available robotic manipulators.

Disassembly tree search paradigm.
An example disassembly tree where nodes represent partial assemblies and edges represent feasible (green) and infeasible (red) disassembly actions.
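
The search described above can be illustrated with a minimal sketch. It is not ASAP's implementation: `plan_assembly`, `can_remove`, and `rank_parts` are hypothetical names, `can_remove` stands in for the path, stability, and execution checks combined, and plain depth-first search is used as one example of an established tree-search technique. Each feasibility check consumes one unit of the evaluation budget.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional

Part = str
State = FrozenSet[Part]


def plan_assembly(
    parts: Iterable[Part],
    can_remove: Callable[[State, Part], bool],
    rank_parts: Callable[[State], List[Part]],
    budget: int,
) -> Optional[List[Part]]:
    """Depth-first disassembly search; returns an assembly order or None.

    can_remove and rank_parts are hypothetical callbacks (not ASAP's API).
    Every can_remove call consumes one unit of the evaluation budget.
    """
    evaluations = 0

    def search(state: State, removed: List[Part]) -> Optional[List[Part]]:
        nonlocal evaluations
        if len(state) <= 1:
            # Fully disassembled: the last remaining part ends the removal order.
            return removed + sorted(state)
        for part in rank_parts(state):
            if evaluations >= budget:
                return None  # budget exhausted
            evaluations += 1
            if can_remove(state, part):
                result = search(state - {part}, removed + [part])
                if result is not None:
                    return result
        return None  # dead end: backtrack to the parent node

    removal_order = search(frozenset(parts), [])
    # Assembly-by-disassembly: reverse the removal order.
    return None if removal_order is None else removal_order[::-1]


# Toy example: a vertical stack where a part is blocked by the part on top of it.
ON_TOP = {"base": "middle", "middle": "top"}


def toy_can_remove(state: State, part: Part) -> bool:
    return ON_TOP.get(part) not in state


def toy_rank(state: State) -> List[Part]:
    return sorted(state)


plan = plan_assembly(["base", "middle", "top"], toy_can_remove, toy_rank, budget=10)
```

In the toy stack, the search removes "top", then "middle", and the reversed order assembles from the base upward. With a budget too small to reach a full disassembly, the function returns None.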

Part Selection

Graph neural network structure for part selection.
Network architecture for learning part disassembly priority. Given an assembly as input, a graph neural network is used to predict the next part to be removed.

To decide which part to disassemble from a given sub-assembly during the tree search, we devise different selection strategies that improve search efficiency:

  • Geometric heuristic: We observe that parts on the outside of an assembly are usually easier to remove because they have fewer precedence constraints, so we prioritize selecting parts from the outside in. It is also possible to prioritize parts with small volume or few adjacent parts.
  • Learning-based: We implement a graph neural network that takes as input the assembly graph and outputs a probability distribution over all parts, indicating the likelihood of each part being next in the disassembly sequence. The model is trained on thousands of assemblies labeled automatically using physics-based simulation.
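
As a rough illustration of the geometric heuristic, the sketch below orders parts from the outside in. It assumes that "outside" can be approximated by the distance between a part's centroid and the mean of all part centroids. That measure, and the name `rank_outside_in`, are assumptions made for this example and may differ from the one ASAP uses.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]


def rank_outside_in(centroids: Dict[str, Point]) -> List[str]:
    """Order part names from outermost to innermost.

    Assumption: a part is "further outside" when its centroid is further
    from the mean of all part centroids.
    """
    if not centroids:
        return []
    n = len(centroids)
    center = tuple(sum(c[i] for c in centroids.values()) / n for i in range(3))
    return sorted(
        centroids,
        key=lambda name: math.dist(centroids[name], center),
        reverse=True,
    )


# Toy example with three parts along the z-axis.
order = rank_outside_in({
    "core": (0.0, 0.0, 0.0),
    "shell": (0.0, 0.0, 3.0),
    "cap": (0.0, 0.0, -6.0),
})
```

A ranking function of this kind could serve as the part-ordering step of a tree search, with a learned model's predicted probabilities substituted for the distance key.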

Pose Selection

We search for the most stable pose when the assembly is placed on a support surface, with guidance from a quasistatic pose estimator.

We additionally propose a pose reuse technique that reduces reorientations of the planned sequences to keep visual consistency and improve the success rate.

Quasistatic stable poses as pose search candidates.
Quasistatic stable poses of a beam assembly with ranked probabilities of landing in each pose if randomly dropped onto the ground.
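
One way to picture pose search with pose reuse is sketched below. It assumes the pose estimator supplies candidate poses with landing probabilities, and that a separate check reports whether the sub-assembly is acceptable in a given pose. `select_pose` and `is_stable` are hypothetical names, and preferring the previous pose whenever it still passes the check is a simplification of the pose reuse idea.

```python
from typing import Callable, Hashable, List, Optional, Tuple


def select_pose(
    candidates: List[Tuple[Hashable, float]],
    is_stable: Callable[[Hashable], bool],
    previous: Optional[Hashable] = None,
) -> Optional[Hashable]:
    """Pick a pose for the current sub-assembly.

    candidates: (pose, probability) pairs from a pose estimator.
    is_stable: hypothetical check for the sub-assembly in that pose.
    previous: pose used in the preceding step, reused when still stable
              so that the plan needs fewer reorientations.
    """
    if previous is not None and is_stable(previous):
        return previous
    # Otherwise try candidates from most to least probable.
    for pose, _prob in sorted(candidates, key=lambda c: c[1], reverse=True):
        if is_stable(pose):
            return pose
    return None


# Toy example: the most probable pose happens to fail the stability check.
poses = [("flat", 0.6), ("side", 0.3), ("upright", 0.1)]


def toy_stable(pose: Hashable) -> bool:
    return pose != "flat"


fresh = select_pose(poses, toy_stable)
reused = select_pose(poses, toy_stable, previous="upright")
```

Without a previous pose, the highest-ranked stable candidate is chosen. With one, the previous pose is kept even though a more probable candidate exists.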

Gravitational Stability Check

Efficiency gain from the proposed stability check algorithm.
Accuracy and speedup of the proposed stability check algorithm compared to the combinatorial ground truth (TP = true positive, TN = true negative, FN = false negative).

To verify the physical feasibility of the assembly sequence, we developed an efficient gravitational stability check algorithm for multi-part, contact-rich assemblies. Given an assembly in a specific pose, the algorithm outputs the set of parts that must be held to make the assembly stable.

Our stability check algorithm iteratively selects unstable parts to hold until the assembly becomes stable. This avoids the combinatorial complexity of the exhaustive check and yields an order-of-magnitude speedup with reasonable accuracy.
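
The iterative idea can be sketched as a greedy loop. This is an illustration under stated assumptions, not the paper's algorithm: `find_unstable` is a hypothetical callback standing in for a physics simulation, and holding the first reported unstable part is an arbitrary selection rule chosen for the example.

```python
from typing import Callable, List, Optional, Set


def parts_to_hold(
    parts: List[str],
    find_unstable: Callable[[List[str], Set[str]], List[str]],
    max_held: Optional[int] = None,
) -> Optional[Set[str]]:
    """Greedily grow the held set until no part is reported unstable.

    find_unstable(parts, held) is a hypothetical simulation callback that
    returns the parts that would move under gravity given the held set.
    Returns None if more than max_held parts would be needed.
    """
    held: Set[str] = set()
    while True:
        unstable = [p for p in find_unstable(parts, held) if p not in held]
        if not unstable:
            return held
        if max_held is not None and len(held) >= max_held:
            return None
        # Assumption: hold the first unstable part reported by the callback.
        held.add(unstable[0])


# Toy model: a part is stable if held, or if it is not loose and its
# supporting part (if any) is stable.
SUPPORTED_BY = {"base": None, "arm": "base", "hook": "arm"}
LOOSE = {"arm"}


def toy_find_unstable(parts: List[str], held: Set[str]) -> List[str]:
    def stable(p: str) -> bool:
        if p in held:
            return True
        if p in LOOSE:
            return False
        support = SUPPORTED_BY[p]
        return support is None or stable(support)

    return [p for p in parts if not stable(p)]


held_parts = parts_to_hold(["base", "arm", "hook"], toy_find_unstable)
```

The loop calls the callback at most once per part plus one, whereas an exhaustive check over all subsets of held parts grows exponentially. The greedy result is not guaranteed to be the smallest possible held set, which is consistent with trading some accuracy for speed.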

Large-Scale Benchmark


To benchmark the performance of our approach, we build a large dataset of 2,146 complex 3D assemblies, each containing from 3 to 50+ parts. We selected 240 assemblies as the test set for benchmarking and used the rest to train the part-selection model.

Dataset statistics.
Overview of our training dataset.
Examples of our training dataset.

Overview of our test dataset.
Examples of our test dataset.


We evaluate the percentage of assemblies from the test dataset that can be disassembled using ASAP, given a specific computational budget. We compare the performance of ASAP against a naive Random Permutation baseline, a Genetic Algorithm baseline and Assemble Them All. The comparisons are shown for different feasibility evaluation budgets (low = 50, high = 400) and different numbers of parts that can be held.

Quantitative comparison results.

Results show that ASAP outperforms all three baselines by a significant margin. Previous methods also do not take gravitational stability into account, producing assembly sequences with floating parts and visibly unstable poses. Note that we conduct quantitative comparisons without constraints from robotic manipulators, which isolates the impact of planning strategies on assembly success while abstracting away the nuances of robotic hardware and specialized tools.

Qualitative comparison results.

Robotic Demonstration

We integrate ASAP with a robotic setup targeting real-world deployment with a UFACTORY xArm 7 robotic arm and a Robotiq 2F-140 gripper. The motion of the robotic arm and gripper is governed by grasp planning, inverse kinematics and collision detection, and the orientation of the rotary table is also planned by a heuristic to optimize arm reachability. The integration is fully extendable to other choices of robot arms and grippers. To our knowledge, ASAP is the first method to generate physically-feasible robotic assembly plans by only taking the assembly and robot specifications as input without additional human guidance. (Note that only the moving arm is shown in the videos.)

Here we demonstrate sim-to-real transfer on a real-world hardware setup with a 3D-printed beam assembly of 5 parts, showing a step-by-step correspondence between simulated ASAP plans and real hardware execution. To aid in the spatial localization of assembly components, we use a laser-cut placemat that allows the robot to determine the precise positioning of parts, thereby reducing potential assembly errors. Direct sim-to-real transfer is non-trivial because of the tight millimeter-level clearances in the assembly joints, which are needed for stability, and the inherent errors in part fabrication and arm localization. The transfer could be made more robust by incorporating vision or force feedback and adaptive manipulation skills.

Simulation Step 1
Simulation Step 2
Simulation Step 3
Simulation Step 4
Real Execution

Additional Experiments

Learning Assembly From Human Intuition

We are motivated by the apparent ability of humans to intuit the correct disassembly order of assemblies. Therefore, we develop a novel data labeling tool and training pipeline to learn part selection strategy from human annotation. Human annotators are shown an interactive 3D view of an assembly and asked to select which part to remove next.

We label both the full assembly and partial assemblies that form a fully connected graph. Each assembly is independently labeled by three annotators and the majority vote label is used. To support the large-scale nature of this labeling job, the labels were crowd-sourced using the Amazon Mechanical Turk workforce.
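
The majority-vote step can be sketched as follows. `majority_label` is a hypothetical helper. Returning None when no label has a strict majority (for example, when three annotators pick three different parts) is an assumption, since the handling of such cases is not described here.

```python
from collections import Counter
from typing import Hashable, Optional, Sequence


def majority_label(labels: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label chosen by a strict majority of annotators.

    Assumption: when no label has a strict majority, return None so the
    sample can be discarded or relabeled.
    """
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None


# Three annotators each pick the part they would remove next.
agreed = majority_label(["part_3", "part_1", "part_3"])
split = majority_label(["part_1", "part_2", "part_3"])
```

Here `agreed` is "part_3", chosen by two of the three annotators, while `split` is None because all three annotators disagree.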

Because human annotations are expensive and noisy, we label only a small subset of the full assembly dataset, and the model trained on these labels does not reach the performance of the model trained on simulation-labeled data. However, we believe that this is a promising direction for future work.

Online labeling platform for learning assembly from human annotated data.
Our interactive data labeling tool. Users are presented with a full assembly or a fully-connected partial assembly and asked to click on the part they would remove next while keeping the rest of the assembly intact.


      title={ASAP: Automated Sequence Planning for Complex Robotic Assembly with Physical Feasibility}, 
      author={Yunsheng Tian and Karl D. D. Willis and Bassel Al Omari and Jieliang Luo and Pingchuan Ma and Yichen Li and Farhad Javid and Edward Gu and Joshua Jacob and Shinjiro Sueda and Hui Li and Sachin Chitta and Wojciech Matusik},