Partners: Aalto University (PI: Prof. Ville Kyrki), LUT-University, Oulu University, GIM Oy, Sandvik Oy, Ponsse Oy, Raute Oy, Mantsinen Oy
Role: Project manager (Aalto), considerable involvement in proposal preparation, writing, and coordination
Budget: 2124k€ total, 671k€ Aalto
Period: May 2022–Apr 2024
abstract
The SANTTU project focuses on the development of systems for operators of work machines and heavy industrial machines. The aim is to simplify the control and operation of the machines with semi-autonomous operator assistance systems. This can be used to reduce the cognitive stress on the operator as well as the stress on the machine, thus extending machine life, availability, and productivity. By combining physics-based (real-time) simulation, digital twins and artificial intelligence technologies - semi-automated systems can be created. We develop a model-based, AI-assisted solution that provides collision prevention, stress reduction, improved accuracy, automation of work routines, and a human-centric user interface design. These lower the competence requirements for the efficient use of the machine, which expands the realistically targeted market, especially in fast-growing market areas.
related publications
journal articles
IJRR
Survey of maps of dynamics for mobile robots
Tomasz Piotr Kucner, Martin Magnusson, Sariah Mghames, Luigi Palmieri, Francesco Verdoja, and 6 more authors
Robotic mapping provides spatial information for autonomous agents. Depending on the tasks they seek to enable, the maps created range from simple 2D representations of the environment geometry to complex, multilayered semantic maps. This survey article is about maps of dynamics (MoDs), which store semantic information about typical motion patterns in a given environment. Some MoDs use trajectories as input, and some can be built from short, disconnected observations of motion. Robots can use MoDs, for example, for global motion planning, improved localization, or human motion prediction. Accounting for the increasing importance of maps of dynamics, we present a comprehensive survey that organizes the knowledge accumulated in the field and identifies promising directions for future work. Specifically, we introduce field-specific vocabulary, summarize existing work according to a novel taxonomy, and describe possible applications and open research problems. We conclude that the field is mature enough, and we expect that maps of dynamics will be increasingly used to improve robot performance in real-world use cases. At the same time, the field is still in a phase of rapid development where novel contributions could significantly impact this research area.
conference articles
IROS
Jointly Learning Cost and Constraints from Demonstrations for Safe Trajectory Generation
Shivam Chaubey, Francesco Verdoja, and Ville Kyrki
In 2024 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS), Oct 2024
Nominated for the Best Safety, Security, and Rescue Robotics Paper award sponsored by the International Rescue System Initiative (IRSI) at the 2024 IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS)
Learning from Demonstration allows robots to mimic human actions. However, these methods do not model constraints crucial to ensure safety of the learned skill. Moreover, even when explicitly modelling constraints, they rely on the assumption of a known cost function, which limits their practical usability for task with unknown cost. In this work we propose a two-step optimization process that allow to estimate cost and constraints by decoupling the learning of cost functions from the identification of unknown constraints within the demonstrated trajectories. Initially, we identify the cost function by isolating the effect of constraints on parts of the demonstrations. Subsequently, a constraint leaning method is used to identify the unknown constraints. Our approach is validated both on simulated trajectories and a real robotic manipulation task. Our experiments show the impact that incorrect cost estimation has on the learned constraints and illustrate how the proposed method is able to infer unknown constraints, such as obstacles, from demonstrated trajectories without any initial knowledge of the cost.
MFI
Localization Under Consistent Assumptions Over Dynamics
Matti Pekkanen, Francesco Verdoja, and Ville Kyrki
In 2024 IEEE Int. Conf. on Multisensor Fusion and Integration for Intelligent Systems (MFI), Sep 2024
Accurate maps are a prerequisite for virtually all mobile robot tasks. Most state-of-the-art maps assume a static world; therefore, dynamic objects are filtered out of the measurements. However, this division ignores movable but non- moving—i.e., semi-static—objects, which are usually recorded in the map and treated as static objects, violating the static world assumption and causing errors in the localization. This paper presents a method for consistently modeling moving and movable objects to match the map and measurements. This reduces the error resulting from inconsistent categorization and treatment of non-static measurements. A semantic segmentation network is used to categorize the measurements into static and semi-static classes, and a background subtraction filter is used to remove dynamic measurements. Finally, we show that consis- tent assumptions over dynamics improve localization accuracy when compared against a state-of-the-art baseline solution using real-world data from the Oxford Radar RobotCar data set.
MFI
Object-Oriented Grid Mapping in Dynamic Environments
Matti Pekkanen, Francesco Verdoja, and Ville Kyrki
In 2024 IEEE Int. Conf. on Multisensor Fusion and Integration for Intelligent Systems (MFI), Sep 2024
Grid maps, especially occupancy grid maps, are ubiquitous in many mobile robot applications. To simplify the process of learning the map, grid maps subdivide the world into a grid of cells whose occupancies are independently estimated using measurements in the perceptual field of the particular cell. However, the world consists of objects that span multiple cells, which means that measurements falling onto a cell provide evidence of the occupancy of other cells belonging to the same object. Current models do not capture this correlation and, therefore, do not use object-level information for estimating the state of the environment. In this work, we present a way to generalize the update of grid maps, relaxing the assumption of independence. We propose modeling the relationship between the measurements and the occupancy of each cell as a set of latent variables and jointly estimate those variables and the posterior of the map. We propose a method to estimate the latent variables by clustering based on semantic labels and an extension to the Normal Distributions Transform Occupancy Map (NDT-OM) to facilitate the proposed map update method. We perform comprehensive map creation and localization ex- periments with real-world data sets and show that the proposed method creates better maps in highly dynamic environments compared to state-of-the-art methods. Finally, we demonstrate the ability of the proposed method to remove occluded objects from the map in a lifelong map update scenario.
workshop articles
ICRA
Jointly Learning Cost and Constraints from Demonstrations for Safe Trajectory Generation
Shivam Chaubey, Francesco Verdoja, and Ville Kyrki
May 2024
Presented at the “Towards Collaborative Partners: Design, Shared Control, and Robot Learning for Physical Human-Robot Interaction (pHRI)” workshop at the IEEE Int. Conf. on Robotics and Automation (ICRA)
Learning from Demonstration (LfD) allows robots to mimic human actions. However, these methods do not model constraints crucial to ensure safety of the learned skill. Moreover, even when explicitly modelling constraints, they rely on the assumption of a known cost function, which limits their practical usability for task with unknown cost. In this work we propose a two-step optimization process that allow to estimate cost and constraints by decoupling the learning of cost functions from the identification of unknown constraints within the demonstrated trajectories. Initially, we identify the cost function by isolating the effect of constraints on parts of the demonstrations. Subsequently, a constraint leaning method is used to identify the unknown constraints. Our approach is validated both on simulated trajectories and a real robotic manipulation task. Our experiments show the impact that incorrect cost estimation has on the learned constraints and illustrate how the proposed method is able to infer unknown constraints, such as obstacles, from demonstrated trajectories without any initial knowledge of the cost.
ICRA
Evaluating the quality of robotic visual-language maps
Matti Pekkanen, Tsvetomila Mihaylova, Francesco Verdoja, and Ville Kyrki
May 2024
Presented at the “Vision-Language Models for Navigation and Manipulation (VLMNM)” workshop at the IEEE Int. Conf. on Robotics and Automation (ICRA)
Visual-language models (VLMs) have recently been introduced in robotic mapping by using the latent representations, i.e., embeddings, of the VLMs to represent the natural language semantics in the map. The main benefit is moving beyond a small set of human-created labels toward open-vocabulary scene understanding. While there is anecdotal evidence that maps built this way support downstream tasks, such as navigation, rigorous analysis of the quality of the maps using these embeddings is lacking. In this paper, we propose a way to analyze the quality of maps created using VLMs by evaluating two critical properties: queryability and consistency. We demonstrate the proposed method by evaluating the maps created by two state-of-the-art methods, VLMaps and OpenScene, using two encoders, LSeg and OpenSeg, using real-world data from the Matterport3D data set. We find that OpenScene outperforms VLMaps with both encoders, and LSeg outperforms OpenSeg with both methods.
ICRA
Modeling movable objects improves localization in dynamic environments
Matti Pekkanen, Francesco Verdoja, and Ville Kyrki
May 2024
Presented at the “Future of Construction: Lifelong Learning Robots in Changing Construction Sites” workshop at the IEEE Int. Conf. on Robotics and Automation (ICRA)
Most state-of-the-art robotic maps assume a static world; therefore, dynamic objects are filtered out of the measurements. However, this division ignores movable but non- moving, i.e., semi-static objects, which are usually recorded in the map and treated as static objects, violating the static world assumption and causing errors in the localization. This paper presents a method for modeling moving and movable objects to match the map and measurements consistently. This reduces the error resulting from inconsistent categorization and treatment of non-static measurements. A semantic segmentation network is used to categorize the measurements into static and semi- static classes, and a background subtraction-based filtering method is used to remove dynamic measurements. Experimental comparison against a state-of-the-art baseline solution using real-world data from the Oxford Radar RobotCar data set shows that consistent assumptions over dynamics increase localization accuracy.
IROS
Generating people flow from architecture of real unseen environments
Francesco Verdoja, Tomasz Piotr Kucner, and Ville Kyrki
Oct 2022
Presented at the “Perception and Navigation for Autonomous Robotics in Unstructured and Dynamic Environments (PNARUDE)” workshop at the IEEE/RSJ Int. Conf. on Intelligent Robots and Systems (IROS)
Recent advances in multi-fingered robotic grasping have enabled fast 6-Degrees-Of-Freedom (DOF) single object grasping. Multi-finger grasping in cluttered scenes, on the other hand, remains mostly unexplored due to the added difficulty of reasoning over obstacles which greatly increases the computational time to generate high-quality collision-free grasps. In this work we address such limitations by introducing DDGC, a fast generative multi-finger grasp sampling method that can generate high quality grasps in cluttered scenes from a single RGB-D image. DDGC is built as a network that encodes scene information to produce coarse-to-fine collision-free grasp poses and configurations. We experimentally benchmark DDGC against the simulated-annealing planner in GraspIt! on 1200 simulated cluttered scenes and 7 real world scenes. The results show that DDGC outperforms the baseline on synthesizing high-quality grasps and removing clutter while being 5 times faster. This, in turn, opens up the door for using multi-finger grasps in practical applications which has so far been limited due to the excessive computation time needed by other methods.