INTRODUCTION
During a recent engagement with a client in the HR Tech industry, our engineering team was tasked with modernizing a critical recruitment automation workflow. The system was designed to generate apprenticeship contracts and send draft emails to hiring managers for final approval. The architecture utilized a low-code orchestration platform to connect an internal ERP system with an enterprise Email API.
The workflow seemed flawless during initial testing: data was fetched, PDFs were generated, and drafts appeared in the managers’ inboxes. However, we soon noticed a subtle but unprofessional glitch. Whenever a candidate’s file contained accented characters (common in names or localized terms like “adhésion”), the email subject line in the draft would display garbled characters—known technically as mojibake—on the receiver’s end. For example, “adhésion” would render as “adhÃ©sion”.
This issue, while seemingly cosmetic, undermined trust in the automated system. It surfaced the challenges of manual MIME (Multipurpose Internet Mail Extensions) construction in integration layers. This article details how we diagnosed the root cause of the encoding failure and the architectural steps we took to resolve it permanently.
PROBLEM CONTEXT
The system was built to handle high volumes of candidate data. The orchestration middleware fetched thread IDs and candidate details, then constructed a raw, RFC 2822-compliant message to post to the email provider’s drafts.create endpoint.
To support international characters, the team attempted to manually encode the headers. The email standard (RFC 5322) mandates that message headers contain only ASCII characters. To include non-ASCII text (such as UTF-8 accents), one must use the “Encoded-Word” syntax defined in RFC 2047 (originally RFC 1342), which looks like this:
=?charset?encoding?encoded-text?=
The initial implementation used JavaScript string manipulation to construct these headers manually. The code looked logically correct: it buffered the subject string, converted it to Base64, and wrapped it in the requisite tags.
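For illustration, a simplified sketch of this kind of manual header construction (hypothetical code, not the client’s exact implementation) might look like the following:

```javascript
// Illustrative sketch of the fragile manual approach (hypothetical):
// Base64-encode the subject and wrap it in RFC 2047 encoded-word delimiters.
function encodeSubjectManually(subject) {
  const b64 = Buffer.from(subject, 'utf8').toString('base64');
  return `=?UTF-8?B?${b64}?=`;
}

// Looks correct in logs...
const header = `Subject: ${encodeSubjectManually('CDO adhésion')}`;
// ...but it does nothing about line folding, CRLF line breaks, or how the
// downstream API re-parses the surrounding message structure.
```

Each individual piece here is valid; the failure modes live in everything around it, as the next section explains.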
However, despite the headers appearing correct in logs, the receiving email clients (webmail, mobile, and desktop clients) consistently displayed the Subject line as Latin-1 decoded garbage (mojibake) instead of the intended UTF-8 text.
WHAT WENT WRONG
The symptom—“é” displaying as “Ã©”—is a classic encoding artifact. It occurs when the two UTF-8 bytes for “é” (0xC3 0xA9) are misinterpreted by the rendering engine as ISO-8859-1 (Latin-1) characters. In Latin-1, 0xC3 maps to “Ã” and 0xA9 maps to “©”.
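This misinterpretation is easy to reproduce in Node.js: encode a string as UTF-8 bytes, then decode those same bytes as Latin-1.

```javascript
// Reproducing the mojibake: UTF-8 bytes read back as Latin-1.
const utf8Bytes = Buffer.from('adhésion', 'utf8'); // 'é' becomes 0xC3 0xA9
const asLatin1 = utf8Bytes.toString('latin1');
console.log(asLatin1); // 'adhÃ©sion' — the exact glitch seen in the drafts
```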
We identified that while the manual implementation successfully created a Base64 string, the surrounding MIME structure was fragile. Manual string concatenation in integration scripts often overlooks subtle requirements of the MIME standard, such as:
- Line Folding: RFC 2822 limits line length to 998 characters (and recommends 78). Long Base64 strings must be split properly.
- Header Structure: Mixing Content-Transfer-Encoding headers meant for the body into the top-level header block can confuse parsers.
- API Pre-processing: When sending a raw string to an Email API, the service often parses and re-serializes the message. If the headers are slightly malformed, the API’s intake parser may default to Latin-1, effectively “baking in” the error before it even reaches the inbox.
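To make the folding requirement concrete, here is a hedged sketch of what correctly splitting a long subject into multiple encoded words involves. Each encoded word must be independently decodable, and continuation lines of a folded header must begin with whitespace. This is simplified illustration only (it ignores surrogate pairs and per-line byte budgets), which is exactly why a library is preferable:

```javascript
// Sketch: folding a long Subject into multiple RFC 2047 encoded words.
// We split the *input* on character boundaries before Base64-encoding each
// chunk, so every encoded word decodes on its own.
function foldEncodedSubject(subject, chunkChars = 15) {
  const words = [];
  for (let i = 0; i < subject.length; i += chunkChars) {
    const chunk = subject.slice(i, i + chunkChars);
    const b64 = Buffer.from(chunk, 'utf8').toString('base64');
    words.push(`=?UTF-8?B?${b64}?=`);
  }
  // Continuation lines of a folded header start with a single space.
  return 'Subject: ' + words.join('\r\n ');
}
```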
The manual approach, while lightweight, lacked the robustness to guarantee that the Email API would interpret the character set correctly during ingestion.
HOW WE APPROACHED THE SOLUTION
Our investigation moved from checking the raw bytes to evaluating the architectural approach to message construction. We considered two paths:
1. Debugging the String Construction: We could try to patch the manual code—ensuring strictly 7-bit headers, enforcing CRLF line breaks, and validating input buffers. However, this leaves the codebase vulnerable to future edge cases (e.g., longer subject lines requiring folding).
2. Implementing a Standard MIME Library: The more mature engineering decision was to replace manual string concatenation with a battle-tested library. In the Node.js ecosystem, libraries like nodemailer or mailcomposer handle the immense complexity of MIME generation automatically.
We chose the second approach. Companies looking to hire software developers often value this distinction: the ability to recognize when not to write custom code for standardized protocols. By offloading the MIME generation to a library, we ensured compliance with RFC specifications regarding folding, boundary generation, and charset definition.
FINAL IMPLEMENTATION
We refactored the integration node to use a dedicated MIME builder. If you are working in a restricted environment (like a serverless function or low-code node) where full libraries are heavy, you can use a lightweight alternative, but for this example, we demonstrate the robustness of using a standard composer pattern.
The following logic replaced the manual string concatenation:
// Generic implementation using a standard mail composition library
const MailComposer = require('nodemailer/lib/mail-composer');
const inputData = {
  subject: "CDO adhésion d'une apprentie au 1er septembre",
  to: "recipient@example.com",
  htmlBody: "Please review the attached contract."
};

const mail = new MailComposer({
  from: "me",
  to: inputData.to,
  subject: inputData.subject, // Library automatically handles UTF-8 and Q/B encoding
  html: inputData.htmlBody,
  textEncoding: 'base64'
});

mail.compile().build((err, messageBuffer) => {
  if (err) throw err;
  // The buffer contains the fully compliant raw MIME message.
  // Convert to URL-safe Base64 for the API:
  const rawMessage = messageBuffer.toString('base64')
    .replace(/\+/g, '-')
    .replace(/\//g, '_')
    .replace(/=+$/, '');
  // Now POST rawMessage to the Email API endpoint
});
Why This Worked
The library automatically detected the non-ASCII characters in the subject. It chose the optimal encoding method (Base64 vs. Quoted-Printable) and formatted the header correctly:
Subject: =?UTF-8?Q?CDO_adh=C3=A9sion_d'une_apprentie...?=
Critically, it ensured that the overall message structure—boundaries, content types, and transfer encodings—was unambiguous. When the Email API received this payload, it correctly identified the UTF-8 charset, preventing the Latin-1 fallback loop that caused the mojibake.
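You can sanity-check such a header yourself. The following is a minimal decoder for the common Q-encoded case shown above (an illustrative sketch, not a full RFC 2047 parser):

```javascript
// Minimal decoder for a UTF-8 Q-encoded word like the Subject shown above.
// Handles only the common cases: '_' means space, '=XX' is a hex-encoded byte.
function decodeQWord(encodedWord) {
  const match = /^=\?UTF-8\?Q\?(.*)\?=$/i.exec(encodedWord);
  if (!match) throw new Error('Not a UTF-8 Q-encoded word');
  const text = match[1];
  const bytes = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === '=') {
      bytes.push(parseInt(text.slice(i + 1, i + 3), 16)); // '=C3' -> 0xC3
      i += 2;
    } else if (text[i] === '_') {
      bytes.push(0x20); // underscore encodes a space
    } else {
      bytes.push(text.charCodeAt(i));
    }
  }
  return Buffer.from(bytes).toString('utf8');
}

console.log(decodeQWord('=?UTF-8?Q?CDO_adh=C3=A9sion?=')); // 'CDO adhésion'
```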
LESSONS FOR ENGINEERING TEAMS
This challenge reinforced several key principles for teams building enterprise integrations:
- Avoid Manual MIME Construction: Email standards are deceptively complex. Writing your own raw message builder is an anti-pattern that leads to fragility.
- Validate Upstream Encodings: Ensure your integration platform isn’t decoding strings incorrectly before they reach your script. Test inputs with special characters early.
- Use Established Libraries: When you hire Node.js developers for automation, ensure they rely on ecosystem standards (like Nodemailer) rather than reinventing wheels, which reduces technical debt.
- Understand API Sanitization: Third-party APIs (like Google Workspace or Office 365) often sanitize inputs. If your input is ambiguous, they will make assumptions (often incorrect ones) about character sets.
- Test Across Clients: Always verify rendering on multiple clients (Web, Mobile, Desktop). What looks right in a JSON log may render poorly in Outlook or Thunderbird.
WRAP UP
Handling character encodings correctly is a hallmark of engineering maturity. While it is tempting to manually patch strings in low-code environments, relying on standardized libraries ensures your communication infrastructure remains professional and reliable. Whether you need to hire backend developers for API integration or build a complete dedicated team, prioritizing standards-compliance avoids embarrassing production glitches.
If you are facing complex integration challenges or need to scale your engineering capacity, contact us to discuss how our dedicated teams can support your roadmap.
Frequently Asked Questions
Why does “é” render as “Ã©” in email subject lines?
This happens when a UTF-8 string is interpreted as Latin-1 (ISO-8859-1). The rendering client sees the two bytes for 'é' (0xC3, 0xA9) and displays them as the individual characters 'Ã' and '©'. It usually indicates missing or malformed charset headers.
Doesn’t Base64 encoding already solve the character-set problem?
Base64 ensures data survives transmission, but it doesn't define the character set. You must still explicitly label the data as UTF-8 within the MIME headers so the receiving client knows how to decode the Base64 bytes back into characters.
Can I construct raw MIME messages by hand?
Technically yes, but it is discouraged. You must handle line folding (78 characters recommended, 998 hard limit), CRLF line breaks, and proper boundary generation manually, which is error-prone and hard to maintain.
How should non-ASCII email headers be encoded?
Headers containing non-ASCII characters must follow RFC 2047 (the successor to RFC 1342). They should be encoded as =?UTF-8?B?Base64String?= or =?UTF-8?Q?QuotedPrintable?=. Using a library to generate the raw string is safer than manual construction.