Easy pose estimation with MMPose

Feb 12, 2024

In this case study, we will explore the process of creating a workflow for pose estimation using MMPose.

The Ikomia API simplifies the development of Computer Vision workflows and provides easy experimentation with different parameters to achieve optimal results.

Get started with Ikomia API

Using the Ikomia API, you can effortlessly create a workflow for pose estimation with MMPose in just a few lines of code.

To get started, all you need is to install the API in a virtual environment.

pip install ikomia

Useful links: the Ikomia API documentation and the Ikomia API repository.

Run MMPose with a few lines of code

While OpenMMLab offers an excellent toolkit for Computer Vision, its documentation can sometimes be challenging to navigate for algorithm development.

Fortunately, these projects have been integrated into the Ikomia ecosystem, simplifying the installation process and making it easy to incorporate into your workflow.

You can also directly load the open-source notebook we have prepared.

Understanding MMPose: pose estimation with OpenMMLab

In the world of Computer Vision, pose estimation aims to determine the position and orientation of predefined keypoints on objects or body parts. For instance, in human pose estimation, the goal is to locate specific keypoints on a person’s body, such as the elbows, knees, and shoulders.
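To make the idea of keypoints concrete, here is a minimal sketch of how the keypoints of one detected person could be represented in Python. The keypoint names follow the COCO 17-keypoint convention for the human body. The ‘Keypoint’ class and the ‘to_keypoints’ helper are hypothetical names introduced for this illustration; they are not part of MMPose or the Ikomia API.

```python
from dataclasses import dataclass

# The 17 human body keypoints of the COCO convention, in their standard order.
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


@dataclass
class Keypoint:
    """One named keypoint (hypothetical container for this example)."""
    name: str
    x: float      # horizontal pixel coordinate
    y: float      # vertical pixel coordinate
    score: float  # confidence in [0, 1]


def to_keypoints(coords, scores):
    """Pair raw (x, y) coordinates and scores with the COCO keypoint names."""
    if len(coords) != len(COCO_KEYPOINTS) or len(scores) != len(COCO_KEYPOINTS):
        raise ValueError("expected one coordinate pair and one score per keypoint")
    return [
        Keypoint(name, float(x), float(y), float(score))
        for name, (x, y), score in zip(COCO_KEYPOINTS, coords, scores)
    ]
```

A pose estimation model typically produces one such set of keypoints per detected person, so an image with three people would yield three lists of 17 keypoints each.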

MMPose, part of the OpenMMLab ecosystem, is a cutting-edge library that provides tools and frameworks specifically designed for various pose estimation tasks.

What is OpenMMLab?

OpenMMLab is a community-driven open-source initiative that concentrates on advancing research in Computer Vision. Originating from the Multimedia Laboratory (MMLab) of the Chinese University of Hong Kong (CUHK), it has grown to encompass contributions from a broad spectrum of researchers, developers, and enthusiasts worldwide.

Objectives

The primary objectives of OpenMMLab are:

Promote Research: By providing high-quality codebases, OpenMMLab facilitates research in Computer Vision, enabling researchers to replicate, innovate, and advance the state-of-the-art.
Ease of Use: With modular and extensible designs, OpenMMLab projects make it easier for developers to experiment, customize, and deploy solutions.
Community Building: OpenMMLab fosters an active community by encouraging contributions, sharing of knowledge, and collaborative research.

Notable projects under OpenMMLab

OpenMMLab develops libraries to cater for specific tasks within Computer Vision.

MMDetection

A comprehensive toolbox designed for object detection and instance segmentation. It supports a plethora of models and is known for its flexibility and performance.

MMSegmentation

Dedicated to semantic segmentation tasks, this toolbox supports numerous state-of-the-art models and provides a platform for research and deployment in segmentation.

MMOCR

A comprehensive toolbox designed for optical character recognition (OCR) tasks. MMOCR provides tools for text detection, recognition, and understanding, supporting a wide range of state-of-the-art algorithms and models. It caters to a broad spectrum of OCR tasks such as scene text detection, recognition, and PDF/table understanding.

MMPose

As previously discussed, MMPose is geared towards pose estimation tasks, from human body keypoints to face and hand keypoints.

What is an example of pose estimation?

MMPose is a versatile toolbox built upon PyTorch that caters to multiple pose estimation tasks, including:

Human pose estimation

Face landmark detection

Hand keypoint detection

Animal pose estimation

Features of MMPose

One of the primary strengths of MMPose lies in its architectural design.

Modularity

MMPose separates its configuration into different modules, enabling researchers to mix and match components, for easy experimentation and deployment.

Multiple backbones supported

MMPose supports a variety of network backbones, such as ResNet, HRNet, and MobileNet, ensuring flexibility based on computational needs.

State-of-the-art models

The library provides pre-trained models that have been trained on standard datasets, enabling users to achieve competitive results out-of-the-box.

Easy customization

Researchers can easily extend the toolbox to cater to their specific requirements, whether it’s a new type of layer, loss, or even dataset.

What is pose estimation used for?

The potential of MMPose spans a wide range of sectors.

Healthcare

Rehabilitation: Monitoring and analyzing patients’ postures and movements aids in designing specialized physiotherapy and rehabilitation programs.
Disease diagnosis: Abnormal body movements or postures could indicate certain medical conditions, and pose estimation can assist in early detection.

Sports and fitness

Performance analysis: Coaches utilize pose estimation to analyze athletes’ techniques, identify areas for improvement, and prevent injuries.
Fitness training: Personal trainers can check that exercises are done correctly, and apps can guide users to maintain proper form during workouts.
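As an illustration of how this kind of form feedback could be derived from keypoints, the sketch below computes the angle at a joint from three 2D points, for example shoulder, elbow, and wrist. The function name ‘joint_angle’ and the sample coordinates are assumptions made for this example; they do not come from MMPose or the Ikomia API.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at point b, between segments b->a and b->c.

    Each point is an (x, y) pair, e.g. shoulder, elbow, wrist for the elbow angle.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("points must be distinct from the joint")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


# Hypothetical pixel coordinates for one arm.
shoulder, elbow, wrist = (100.0, 100.0), (100.0, 200.0), (200.0, 200.0)
elbow_angle = joint_angle(shoulder, elbow, wrist)
```

An application could compare such angles against a target range for a given exercise and flag repetitions that fall outside it.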

Gaming and entertainment

Augmented reality (AR) and virtual reality (VR): Pose estimation is vital for creating immersive AR/VR experiences, especially for games that mimic users’ movements.
Dance simulations: Games where players mimic on-screen dance moves.

Film and animation

Motion capture: Instead of using traditional bodysuits with markers, pose estimation can capture human movements to animate digital characters.

Security and surveillance

Suspicious activity detection: Identifying unusual postures or movements in public areas can be indicative of suspicious activities.

Retail and fashion

Virtual try-on: Using pose estimation, users can virtually try on clothing, accessories, or makeup, getting a preview of how items might look on them.

Human-computer interaction

Gesture control: Users can interact with devices or applications using specific body movements or gestures, recognized through pose estimation.

Education and training

Physical training: In activities like yoga or dance, pose estimation provides feedback to learners about their posture and movement.
Interactive learning: Children can interact with educational content through body movements, making learning more engaging.

Robotics

Human-robot interaction: Robots can understand human intentions better by interpreting body posture and gestures, enabling more natural interactions.

Automotive

Driver monitoring: To enhance safety, pose estimation can monitor drivers for signs of drowsiness or distraction, alerting them if necessary.

These use cases are just the tip of the iceberg. As technology advances, the applications of pose estimation will continue to grow and diversify.

MMPose toolkit

MMPose from OpenMMLab offers an extensive toolkit for pose estimation tasks, combining flexibility, ease of use, and state-of-the-art performance.

Its modular architecture allows researchers and developers to customize and extend it to meet their specific needs. Whether you’re a beginner just starting with pose estimation or a seasoned researcher looking for a robust framework, MMPose is a worthy addition to your toolkit.

Step by step MMPose pose estimation with the Ikomia API

In this section, we will demonstrate how to utilize the Ikomia API to create a workflow for pose estimation with MMPose as presented above.

Step 1: import

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils import ik
from ikomia.utils.displayIO import display

The ‘Workflow’ class is the base object for creating a workflow. It provides methods for setting inputs (image, video, directory), configuring task parameters, obtaining time metrics, and retrieving specific task outputs, such as graphics, segmentation masks, and texts.
‘ik’ is an auto-completion system designed to give convenient access to algorithms and their settings.
The ‘display’ function offers a flexible and customizable way to display images (input/output) and graphics, such as bounding boxes and segmentation masks.

Step 2: create workflow

wf = Workflow()

We initialize a workflow instance. The “wf” object can then be used to add tasks to the workflow instance, configure their parameters, and run them on input data.

Step 3: add and connect MMPose

pose = wf.add_task(ik.infer_mmlab_pose_estimation(
        config_file = "configs/body_2d_keypoint/topdown_heatmap/coco/td-hm_vipnas-mbv3_8xb64-210e_coco-256x192.py",
        conf_thres = '0.5',
        conf_kp_thres = '0.3',
        detector = "Person"
        ),
        auto_connect=True
)

‘config_file’: path to the model configuration file.
‘conf_thres’: threshold for Non-Maximum Suppression. Poses are retained when their Object Keypoint Similarity (OKS) overlap is below ‘conf_thres’.
‘conf_kp_thres’: keypoint visibility threshold. Object Keypoint Similarity is computed only from keypoints whose visibility is higher than ‘conf_kp_thres’.
‘detector’: object detector to use: ‘Person’, ‘Hand’, or ‘Face’.
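To clarify how a keypoint visibility threshold enters the Object Keypoint Similarity, here is a simplified sketch based on the standard COCO formulation of OKS. It is an illustration under stated assumptions, not the code that MMPose or Ikomia actually run: the function name ‘oks’, its signature, the rule that both poses must exceed the threshold, and the sample per-keypoint constants (‘sigmas’) are all hypothetical.

```python
import math


def oks(pose_a, pose_b, area, sigmas, vis_thres=0.3):
    """Simplified Object Keypoint Similarity between two poses.

    pose_a, pose_b: lists of (x, y, score), one entry per keypoint.
    area: object scale squared (e.g. bounding-box area in pixels).
    sigmas: per-keypoint falloff constants (hypothetical values here).
    vis_thres: keypoints scoring at or below this in either pose are ignored.
    """
    total, count = 0.0, 0
    for (ax, ay, a_score), (bx, by, b_score), sigma in zip(pose_a, pose_b, sigmas):
        if a_score <= vis_thres or b_score <= vis_thres:
            continue  # skip keypoints that are not confidently visible
        d2 = (ax - bx) ** 2 + (ay - by) ** 2
        k = 2.0 * sigma  # COCO uses k_i = 2 * sigma_i
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        count += 1
    return total / count if count else 0.0


# Hypothetical 3-keypoint poses; the third keypoint falls below the threshold.
pose = [(10.0, 10.0, 0.9), (20.0, 20.0, 0.9), (30.0, 30.0, 0.1)]
shifted = [(x + 1.0, y, s) for x, y, s in pose]
sample_sigmas = [0.05, 0.05, 0.05]
similarity = oks(pose, shifted, area=100.0, sigmas=sample_sigmas)
```

Two identical poses give a similarity of 1, and the value decays toward 0 as keypoints move apart relative to the object's size. An OKS-based Non-Maximum Suppression step would then discard a pose when its similarity to a higher-scoring pose exceeds the suppression threshold.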

You can get the full list of available ‘config_file’ values by running the following code snippet:

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mmlab_pose_estimation", auto_connect=True)

# Get pretrained models
model_zoo = algo.get_model_zoo()

# Print possibilities
for parameters in model_zoo:
    print(parameters)

Step 4: apply your workflow to your image

You can apply the workflow to your image using the ‘run_on()’ function. In this example, we use an image URL:

wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_fireman.jpg")

Step 5: display your results

Finally, you can display image results using the display function:

display(pose.get_image_with_graphics())

Example for hand pose estimation:

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils import ik
from ikomia.utils.displayIO import display

wf = Workflow()

pose = wf.add_task(ik.infer_mmlab_pose_estimation(
        config_file = "configs/hand_2d_keypoint/topdown_regression/onehand10k/td-reg_res50_8xb64-210e_onehand10k-256x256.py",
        detector = "Hand"
        ),
        auto_connect=True
)

wf.run_on(url="https://www.science.org/do/10.1126/science.aac4663/full/sn-rockpaper-1644945533143.jpg")

display(pose.get_image_with_graphics())

Build your own workflow with Ikomia

In this tutorial, we have explored the process of creating a workflow for pose estimation with MMPose.

For a deeper understanding of pose estimation, refer to our comprehensive OpenPose guide.

The Ikomia API streamlines the development of Computer Vision workflows, facilitating easy experimentation with various parameters to achieve the best outcomes.

For a comprehensive presentation of the API, consult the documentation. Additionally, browse the list of cutting-edge algorithms available on Ikomia HUB and explore Ikomia STUDIO, which provides a user-friendly interface with the same functionalities as the API.
