Intro
tf2 (TransForm version 2) is a ROS library that provides an easy way to calculate transformations between different coordinate frames (e.g., the robot’s base frame, camera frame, map frame).
tf2 is the second generation of the transform library, which lets the user keep track of multiple coordinate frames over time. tf2 maintains the relationships between coordinate frames in a tree structure buffered in time, and lets the user transform points, vectors, etc., between any two coordinate frames at any desired point in time.
geometryCoordinateFrameConventions
A robotic system typically has many 3D coordinate frames that change over time, such as a world frame, base frame, gripper frame, head frame, etc. tf2 keeps track of all these frames over time, and allows you to ask questions like:
- Where was the head frame relative to the world frame 5 seconds ago?
- What is the pose of the object in my gripper relative to my base?
- What is the current pose of the base frame in the map frame?
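
A question such as "where was this frame 5 seconds ago?" can only be answered if past poses are kept with their timestamps. The sketch below illustrates that idea in plain Python. It is a toy model, not tf2's API: the class `FrameHistory`, its methods, and the 2D pose layout `(x, y, yaw)` are all assumptions made for illustration, and the yaw is interpolated linearly without handling angle wrap-around.

```python
from bisect import bisect_left


class FrameHistory:
    """Toy time-stamped buffer of 2D poses (x, y, yaw) for one frame.

    Hypothetical helper for illustration only; this is not part of tf2.
    """

    def __init__(self):
        self._stamps = []  # timestamps in seconds, kept sorted
        self._poses = []   # (x, y, yaw) tuples, parallel to _stamps

    def add(self, stamp, pose):
        # Insert the sample at the position that keeps timestamps sorted.
        i = bisect_left(self._stamps, stamp)
        self._stamps.insert(i, stamp)
        self._poses.insert(i, pose)

    def lookup(self, stamp):
        """Return the pose at `stamp`, interpolating between neighbours."""
        if not self._stamps:
            raise LookupError("no samples buffered")
        if stamp < self._stamps[0] or stamp > self._stamps[-1]:
            raise LookupError("requested time is outside the buffered range")
        i = bisect_left(self._stamps, stamp)
        if self._stamps[i] == stamp:
            return self._poses[i]
        # Linear interpolation between the samples just before and after.
        t0, t1 = self._stamps[i - 1], self._stamps[i]
        a = (stamp - t0) / (t1 - t0)
        p0, p1 = self._poses[i - 1], self._poses[i]
        return tuple(u + a * (v - u) for u, v in zip(p0, p1))


# Two samples of a hypothetical head frame, 10 seconds apart.
head = FrameHistory()
head.add(0.0, (0.0, 0.0, 0.0))
head.add(10.0, (2.0, 0.0, 1.0))

now = 10.0
pose_5s_ago = head.lookup(now - 5.0)
```

Here `pose_5s_ago` is `(1.0, 0.0, 0.5)`, halfway between the two buffered samples. Requests outside the buffered time range raise an error instead of guessing.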
 
Understanding ROS Transforms
https://foxglove.dev/blog/understanding-ros-transforms
In robotics, a frame is the coordinate system describing an object’s position and orientation, typically along x, y, and z axes. ROS requires each frame to have its own unique frame_id, and a typical scene consists of multiple frames, one for each robot component (e.g., limb, sensor), detected object, and player in the robot’s world. A scene always has one base frame, usually named world or map, that is an unmoving constant. All other frames in the scene, including the robot’s own frame, which ROS typically calls base_link, are children of this base frame and are positioned relative to this parent in some way.
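
Because every frame has exactly one parent, the scene forms a tree, and any frame can be traced back to the base frame by following parent links. The sketch below shows this with a plain dictionary. The `PARENT` mapping and the function `chain_to_root` are hypothetical names chosen for illustration (not a ROS or tf2 API), and the scene layout is an assumed example with map as the base frame.

```python
# Hypothetical scene: child frame_id -> parent frame_id.
# "map" has no entry because it is the base (root) frame.
PARENT = {
    "base_link": "map",
    "sensor_link": "base_link",
    "arm_base_link": "base_link",
    "arm_end_link": "arm_base_link",
}


def chain_to_root(frame_id, parent=PARENT):
    """Return the list of frames from `frame_id` up to the root frame."""
    chain = [frame_id]
    seen = {frame_id}
    while chain[-1] in parent:
        nxt = parent[chain[-1]]
        if nxt in seen:
            # A valid frame tree never loops back on itself.
            raise ValueError(f"cycle detected at frame {nxt!r}")
        chain.append(nxt)
        seen.add(nxt)
    return chain


path = chain_to_root("arm_end_link")
```

Here `path` is `["arm_end_link", "arm_base_link", "base_link", "map"]`. Transforming a point between two frames amounts to walking such a chain and applying one transform per link.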
Calculating an object’s position using transforms
tf2 under the hood example
Each child frame has a transform that represents its position vector and rotation in relation to its parent frame:
Parent: base_link
- sensor_link: Position: $(x_s, y_s)$, Rotation: $\theta_s$ radians
- arm_base_link: Position: $(1, -1)$, Rotation: $\theta$ radians
Parent: arm_base_link
- arm_end_link: Position: $(1, 1)$, Rotation: $0$ (no rotation)
Calculation
To deduce the position of arm_end_link in the base_link frame, we need to follow the chain of transformations from arm_end_link to base_link. Below, $\theta$ denotes the rotation of arm_base_link relative to base_link, and $(x_s, y_s)$ and $\theta_s$ denote the position and rotation of sensor_link relative to base_link. Here is how to approach it step by step using the given transforms:

1. Transformation from base_link to sensor_link:
- Position: $(x_s, y_s)$ in the base_link frame.
- Rotation: $\theta_s$ radians in the base_link frame.
This means that sensor_link is positioned at $(x_s, y_s)$ in base_link, and its orientation is rotated by $\theta_s$ from base_link. This transform is not on the path from arm_end_link to base_link, so it does not enter the calculation.

2. Transformation from base_link to arm_base_link:
- Position: $(1, -1)$ in the base_link frame.
- Rotation: $\theta$ radians in the base_link frame.
This means that arm_base_link is positioned at $(1, -1)$ in base_link, and it has a rotation of $\theta$ radians with respect to base_link.

3. Transformation from arm_base_link to arm_end_link:
- Position: $(1, 1)$ in the arm_base_link frame.
- Rotation: $0$ (no rotation).
This means that arm_end_link is positioned at $(1, 1)$ in the arm_base_link frame and has no rotation.

Now, we need to compute the overall position of arm_end_link in the base_link frame by following these transformations step by step:

Step 1: Find the position of arm_end_link in the arm_base_link frame.
- Position: $(1, 1)$ in arm_base_link.

Step 2: Apply the transform from base_link to arm_base_link.
The position of arm_base_link in base_link is $(1, -1)$, and it is rotated by $\theta$ radians with respect to the base_link frame. To apply this transform, we first rotate and then translate the position of arm_end_link, $(1, 1)$, given in the arm_base_link frame.

Rotation: Rotate by $\theta$ radians. Use a rotation matrix to rotate the point $(1, 1)$:

$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

Applying the rotation to the position $(1, 1)$:

$$R(\theta) \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} \cos\theta - \sin\theta \\ \sin\theta + \cos\theta \end{bmatrix}$$

Translation: After rotation, translate by the position of arm_base_link in base_link, $(1, -1)$:

$$\begin{bmatrix} \cos\theta - \sin\theta \\ \sin\theta + \cos\theta \end{bmatrix} + \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 + \cos\theta - \sin\theta \\ -1 + \sin\theta + \cos\theta \end{bmatrix}$$

This gives us the final position of arm_end_link in the base_link frame.
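
The rotate-then-translate calculation can be checked numerically. The following is a minimal sketch in plain Python that uses the two positions from the example, $(1, -1)$ for arm_base_link in base_link and $(1, 1)$ for arm_end_link in arm_base_link. The rotation angle is left as a parameter `theta` (in radians), and the function and constant names are hypothetical, chosen for illustration only. tf2 itself works in 3D with quaternions; this 2D version only mirrors the hand calculation.

```python
import math


def apply_transform(point, translation, theta):
    """Map a 2D point from a child frame into its parent frame.

    The point is first rotated by `theta` (radians, counter-clockwise)
    and then shifted by the child frame's position in the parent frame.
    """
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + translation[0],
            s * x + c * y + translation[1])


# Positions taken from the example.
ARM_BASE_IN_BASE = (1.0, -1.0)     # arm_base_link origin, in base_link
ARM_END_IN_ARM_BASE = (1.0, 1.0)   # arm_end_link origin, in arm_base_link


def arm_end_in_base(theta):
    """Position of arm_end_link in base_link for a given arm rotation."""
    return apply_transform(ARM_END_IN_ARM_BASE, ARM_BASE_IN_BASE, theta)
```

For example, `arm_end_in_base(0.0)` returns `(2.0, 0.0)`: with no rotation, the two offsets simply add. Longer chains are handled the same way, by applying one such transform per link from the child frame up to the target frame.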






