INTRODUCTION
While working on a perception pipeline for an autonomous mobility platform, our engineering team faced a complex data migration challenge. The computer vision models relied heavily on highly accurate semantic segmentation to identify vehicles, pedestrians, and drivable surfaces. The data science team had initially annotated a large batch of images using an external open-source labeling tool, but as the dataset grew, the need for scalable, distributed quality assurance became critical.
To handle this, we decided to migrate the existing annotations into Amazon SageMaker Ground Truth to create a labeling adjustment job. This would allow a specialized workforce to review and correct the existing labels. However, during the migration, we encountered a frustrating scenario: the migration pipeline successfully generated the manifest files and the labeling job started without errors, but the existing annotations failed to render in the Ground Truth worker UI.
This silent failure in the MLOps pipeline halted our labeling QA process. This challenge inspired this article, detailing how we uncovered the hidden constraints of Ground Truth manifests and engineered a reliable data transformation layer so other teams can avoid similar bottlenecks.
PROBLEM CONTEXT
In machine learning architectures, data labeling is rarely a one-time event. It is a continuous cycle of prediction, annotation, and adjustment. For this specific use case in the automotive sector, we needed to pass previously labeled images back to human reviewers to refine the polygon edges around complex objects like partially obscured vehicles.
The original annotations were exported from the external tool in a JSON format. To import these into Ground Truth for an adjustment job, AWS requires an augmented manifest file (a JSON Lines file) that maps the source image to the existing annotation data. Our initial approach was to transform the external JSON into the structure AWS outlined in their documentation for custom output boxes.
Our integration layer formatted the manifest so that each line referenced the S3 location of the original image and of the newly mapped JSON annotation file.
WHAT WENT WRONG
After generating the manifest and launching the Ground Truth adjustment job, the workers reported that the UI loaded the base images, but the previous segmentation masks were completely missing. There were no CloudWatch errors indicating a structural failure in the job creation.
We examined a sample line from our generated manifest:
{"source-ref": "s3://autonomous-training-data/batch_01/img_0299.jpg", "labeling-qa-ref": "s3://autonomous-training-data/aws/annotations/img_0299.json", "labeling-qa-ref-metadata": {"internal-color-map": {"0": {"class-name": "BACKGROUND", "confidence": 0.9, "hex-color": "#ffffff"}, "1": {"class-name": "Car", "confidence": 0.9, "hex-color": "#2ca02c"}, "2": {"class-name": "Road", "confidence": 0.9, "hex-color": "#1f77b4"}, "3": {"class-name": "Person", "confidence": 0.9, "hex-color": "#ff7f0e"}}, "type": "groundtruth/semantic-segmentation", "job-name": "labeling-qa", "human-annotated": "yes", "creation-date": "2024-01-01T00:00:00.000000"}}
The JSON structure perfectly matched the documentation for manifest formatting. The internal-color-map was correct, and the paths were valid. However, the architectural oversight lay in the specific modality requirements of Ground Truth.
While bounding box jobs utilize JSON files to define coordinates, semantic segmentation jobs in Ground Truth do not accept JSON files for existing annotations. Ground Truth strictly requires a 1-channel, 8-bit PNG image mask where each pixel’s integer value corresponds to the class ID defined in the internal-color-map. Because our labeling-qa-ref pointed to a .json file, the Ground Truth rendering engine silently failed to parse it as an image mask.
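A cheap pre-flight check on the manifest can surface this kind of mismatch before any worker opens a task. The sketch below is a hypothetical helper, not part of any Ground Truth API: it assumes the label attribute is named labeling-qa as in the sample line, uses placeholder S3 paths, and only inspects the extension of the referenced object rather than downloading it.

```python
import json


def find_non_png_label_refs(manifest_text, label_attribute="labeling-qa"):
    """Return (line_number, ref) pairs whose label reference is not a .png.

    Hypothetical helper: `label_attribute` is the label attribute name used
    in the manifest ("labeling-qa" in the sample line).
    """
    ref_key = f"{label_attribute}-ref"
    problems = []
    for line_number, line in enumerate(manifest_text.splitlines(), start=1):
        if not line.strip():
            continue  # tolerate blank lines between records
        record = json.loads(line)
        ref = record.get(ref_key, "")
        if not ref.lower().endswith(".png"):
            problems.append((line_number, ref))
    return problems


# Two placeholder records: the first repeats the mistake described above.
manifest = "\n".join([
    json.dumps({"source-ref": "s3://example-bucket/img_1.jpg",
                "labeling-qa-ref": "s3://example-bucket/annotations/img_1.json"}),
    json.dumps({"source-ref": "s3://example-bucket/img_2.jpg",
                "labeling-qa-ref": "s3://example-bucket/masks/img_2.png"}),
])
print(find_non_png_label_refs(manifest))
```

An extension check is deliberately shallow. It catches a wrong file type in the reference, but it says nothing about whether the PNG itself is single-channel or correctly indexed.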
HOW WE APPROACHED THE SOLUTION
Identifying that the core issue was a file format mismatch, we needed to re-architect the data transformation step in our MLOps pipeline. The challenge was converting vector-based polygon annotations (from the external JSON) into rasterized PNG masks that strictly adhered to Ground Truth’s indexing rules.
We evaluated two approaches:
- Client-Side Rendering: Attempt to inject a custom UI template into Ground Truth that could parse the JSON polygons. We quickly discarded this as it bypassed the native, optimized segmentation tools Ground Truth provides to workers.
- Backend Rasterization Pipeline: Build a pre-processing step that reads the external JSON, uses computer vision libraries to draw filled polygons onto a blank numpy array, and saves the output as a 1-channel PNG mask.
We chose the backend rasterization approach for its robustness and compatibility with native AWS tools. It also ensures that every transformation completes reliably before the data reaches Ground Truth.
FINAL IMPLEMENTATION
We developed a Python-based preprocessing job that ran on AWS Batch prior to kicking off the Ground Truth job. The script performed the following steps:
1. Download the original image to determine the exact dimensions (Height, Width).
2. Create a 2D numpy array of zeros (representing the BACKGROUND class).
3. Parse the external JSON for polygon coordinates.
4. Use OpenCV (cv2.fillPoly) to draw the polygons onto the array, using the integer class ID (e.g., 1 for Car, 2 for Road) as the pixel color value.
5. Save the array as a single-channel, 8-bit PNG (mode L) using the Python Imaging Library (PIL).
6. Upload the PNG to S3 and generate the corrected manifest.
Here is a sanitized snippet of the correct manifest format pointing to the newly generated PNG mask:
{"source-ref": "s3://autonomous-training-data/batch_01/img_0299.jpg", "labeling-qa-ref": "s3://autonomous-training-data/aws/masks/img_0299.png", "labeling-qa-ref-metadata": {"internal-color-map": {"0": {"class-name": "BACKGROUND", "confidence": 0.9, "hex-color": "#ffffff"}, "1": {"class-name": "Car", "confidence": 0.9, "hex-color": "#2ca02c"}, "2": {"class-name": "Road", "confidence": 0.9, "hex-color": "#1f77b4"}, "3": {"class-name": "Person", "confidence": 0.9, "hex-color": "#ff7f0e"}}, "type": "groundtruth/semantic-segmentation", "job-name": "labeling-qa", "human-annotated": "yes", "creation-date": "2024-01-01T00:00:00.000000"}}
With this strict rasterization in place, and with the PNG files saved as 8-bit single-channel images (PIL mode L) rather than RGB, the adjustment job loaded the existing annotations in the Ground Truth UI, and workers could adjust the polygon edges directly.
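A manifest line of this shape could be produced with a small helper along these lines. The function name, its parameters, and the example URIs are illustrative assumptions. The field names and the color map are copied from the sample line, and the helper refuses any mask reference that is not a .png so the original mistake cannot reappear unnoticed.

```python
import json

# Color map copied from the sample manifest line.
COLOR_MAP = {
    "0": {"class-name": "BACKGROUND", "confidence": 0.9, "hex-color": "#ffffff"},
    "1": {"class-name": "Car", "confidence": 0.9, "hex-color": "#2ca02c"},
    "2": {"class-name": "Road", "confidence": 0.9, "hex-color": "#1f77b4"},
    "3": {"class-name": "Person", "confidence": 0.9, "hex-color": "#ff7f0e"},
}


def build_manifest_line(image_uri, mask_uri, creation_date, job_name="labeling-qa"):
    """Serialise one augmented-manifest record as a single JSON line.

    Hypothetical helper: `job_name` doubles as the label attribute name,
    as it does in the sample line.
    """
    if not mask_uri.lower().endswith(".png"):
        raise ValueError(f"mask reference must be a PNG, got: {mask_uri}")
    record = {
        "source-ref": image_uri,
        f"{job_name}-ref": mask_uri,
        f"{job_name}-ref-metadata": {
            "internal-color-map": COLOR_MAP,
            "type": "groundtruth/semantic-segmentation",
            "job-name": job_name,
            "human-annotated": "yes",
            "creation-date": creation_date,
        },
    }
    # json.dumps emits no newlines by default, which JSON Lines requires.
    return json.dumps(record)


line = build_manifest_line(
    "s3://example-bucket/batch/img_0001.jpg",   # placeholder URIs
    "s3://example-bucket/masks/img_0001.png",
    "2024-01-01T00:00:00.000000",
)
print(line)
```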
LESSONS FOR ENGINEERING TEAMS
Migrating complex ML datasets between platforms requires strict adherence to undocumented or easily missed constraints. Here are the key takeaways for technical teams:
- Modality Dictates Format: Never assume uniform data formats across different ML tasks. Bounding boxes use JSON, but semantic segmentation almost universally relies on indexed PNG masks to maintain pixel-level mapping efficiency.
- Beware of Silent Failures: Managed services often fail silently when parsing malformed UI inputs. Always implement micro-batch testing (e.g., a 5-image labeling job) before running bulk migrations.
- Pixel Values Matter: When generating masks, do not use the RGB hex colors in the image itself. The PNG pixel value must be the literal integer of the class index (0, 1, 2, 3), which the UI then maps to the hex colors defined in the manifest metadata.
- Invest in Data Transformation Layers: Decouple your labeling tools from your model training pipelines. Build modular ETL steps that can adapt to changing input formats without breaking downstream systems.
- Validate Dimensions: Ensure that your generated PNG mask has exactly the same pixel dimensions as the source image. Even a 1-pixel discrepancy will cause Ground Truth to reject the mask rendering.
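Several of these checks are easy to automate before a mask is uploaded. The following sketch uses only Pillow and is an illustration rather than our exact validation code: check_mask and its arguments are hypothetical names, and the color map is reduced to its keys because only the keys matter for the comparison.

```python
from PIL import Image


def check_mask(mask, source_size, color_map):
    """Return a list of problems found in a PIL mask image (empty if none).

    `source_size` is the source image size as (width, height), which is
    the order Pillow uses for Image.size.
    """
    problems = []
    if mask.mode != "L":
        problems.append(f"mode is {mask.mode}, expected L")
    if mask.size != source_size:
        problems.append(f"size {mask.size} != source {source_size}")
    if mask.mode == "L":
        allowed = {int(key) for key in color_map}
        # For mode "L", getcolors() yields (count, pixel_value) pairs.
        present = {value for _count, value in mask.getcolors()}
        unknown = sorted(present - allowed)
        if unknown:
            problems.append(f"pixel values not in color map: {unknown}")
    return problems


color_map = {"0": {}, "1": {}, "2": {}, "3": {}}  # keys only
good = Image.new("L", (640, 480), 0)
bad = Image.new("L", (640, 479), 4)  # one row short, undefined class 4

print(check_mask(good, (640, 480), color_map))
print(check_mask(bad, (640, 480), color_map))
```

Returning a list of problems instead of raising on the first one makes it practical to run the check across a whole batch and report every failing mask in one pass.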
WRAP UP
Integrating diverse MLOps tools often reveals friction points in data serialization and format requirements. By transitioning our annotation migration from a JSON-based approach to a rigorous PNG rasterization pipeline, we successfully unlocked AWS Ground Truth’s adjustment capabilities for our semantic segmentation tasks, ensuring high-quality data for the autonomous driving models.
Frequently Asked Questions
Why does semantic segmentation use PNG masks instead of JSON?
Semantic segmentation requires labeling at the individual pixel level. Storing thousands of pixel coordinates or complex polygon arrays in JSON becomes computationally expensive and bloated. A 1-channel PNG acts as a highly compressed, native 2D matrix where every pixel inherently maps to a class index, making rendering and training significantly faster.
Can I upload an RGB PNG as the existing mask?
No. Ground Truth requires a single-channel, 8-bit grayscale PNG. If you upload an RGB image, the rendering engine will not be able to map the 3-channel values to the single integer keys defined in the internal-color-map.
How should I test a migration before running it at scale?
Always create a subset of your dataset containing 3 to 5 records. Launch a private labeling job that uses your own engineering team as the workforce. This allows you to verify that images load, masks align perfectly, and labels save correctly without incurring the costs or delays associated with a full workforce deployment.
What happens if the mask contains a pixel value that is missing from the internal-color-map?
If the PNG mask contains pixel values (e.g., a pixel with value 4) that do not exist in the internal-color-map metadata, the UI will either fail to render the mask entirely or display an unclassified error region. The map must exhaustively cover every unique pixel value present in the mask file.