8 tips to use the String class efficiently

In a previous article, we saw what heap fragmentation is and why it’s essential to reduce it.

In most Arduino programs, the principal source of fragmentation is the String class. Because it uses the heap behind the scenes, you don’t realize that allocation happens.

No, this article is not another rant about the String class; on the contrary, we’ll see how we can improve our programs while keeping the String class.

Why bother?

At first, I wanted to write an article to explain why the String class is terrible and how easy it is to avoid it. However, I realized that advising you to rewrite a huge part of your code is neither pleasant nor helpful.

Indeed, the String class has some design flaws, but it’s not all bad. First, it provides an intuitive syntax that is familiar to Java and C# programmers, so it certainly helps beginners. Second, the String class provides a convenient means to bring a Flash string into the RAM temporarily, as we’ll see.

Avoiding heap allocation is not always possible. Many Arduino cores (most notably the ones for ESP8266 and ESP32) put a very low limit on the stack size, so it’s not possible to put large strings there. In this case, using the String class makes perfect sense.

Finally, some libraries (for example, ESP8266WebServer) force us to use the String class. Yes, it drives me crazy too, but I still prefer reusing the library to writing my own.

For all these reasons, we need ways to make efficient programs while still using the String class.

What I mean by “efficient”

Before showing you how to improve the efficiency of the program, I need to clarify what I mean. In this article, “being efficient” means:

using as little RAM as possible
using as few CPU cycles as possible
reducing heap fragmentation

To reduce the RAM usage, we ensure that only one copy of a string is present at any given time.

To use fewer CPU cycles, we avoid moving bytes from one place to another.

To reduce heap fragmentation, we must reduce the number of allocations and use blocks of constant size.

Code instrumentation

To benchmark the efficiency of the tips presented here, I modified the original String class to make it log each allocation and each duplication. For example, it logs the calls to malloc(), free(), and strcpy(). It also watches the calls to realloc(), which are very important because this function both allocates and moves bytes.
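To give a rough idea of what such instrumentation can look like, here is a minimal sketch that wraps the allocation functions and counts the calls. The names AllocStats, loggedMalloc(), loggedRealloc(), and loggedFree() are hypothetical; the instrumented String class in the sample project may be implemented differently.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical counters; they are not part of any Arduino core.
struct AllocStats {
  int mallocs = 0;
  int reallocs = 0;
  int frees = 0;
};

static AllocStats stats;

// Each wrapper logs the call, updates a counter, and forwards to the real function.
void* loggedMalloc(size_t size) {
  printf("malloc(%lu)\n", (unsigned long)size);
  stats.mallocs++;
  return malloc(size);
}

void* loggedRealloc(void* ptr, size_t size) {
  printf("realloc(%p, %lu)\n", ptr, (unsigned long)size);
  stats.reallocs++;
  return realloc(ptr, size);
}

void loggedFree(void* ptr) {
  printf("free(%p)\n", ptr);
  stats.frees++;
  free(ptr);
}
```

A String implementation that calls these wrappers instead of malloc(), realloc(), and free() would produce one log line per heap operation, which makes it easy to compare two versions of the same snippet.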

As usual, you’ll find the code samples on GitHub. In this article, each code snippet starts with a number indicating the corresponding function in the sample project.

Tip 1: Initialize from Flash string

When you define a string literal, the compiler adds it to the “global” area¹, meaning that all the characters are always in the RAM whether you’re using them or not.

Here is an example:

// #1
void sayHello() {
   String s = "hello";
   send(s);
}

The six bytes composing the string “hello” (including the terminator) are present in RAM during the whole execution of the program. When sayHello() runs, the constructor of String makes a copy in the heap.

This program is inefficient because we have two copies of the same string in the RAM. To improve it, we can prevent the compiler from putting the literal in the RAM and keep it in the Flash memory instead. The PROGMEM attribute allows doing that, but since it’s not very convenient, Arduino provides the F() macro.

Here is the fixed version:

// #2
void sayHello() {
   String s = F("hello");
   send(s);
}

Now, the six bytes are copied to the RAM only when sayHello() executes.

Flash strings can significantly improve the efficiency of your program, but they have a few gotchas, as we saw in a previous article.

Tip 2: Use `c_str()` instead of `toCharArray()`

Suppose you have a String s and need to call a function send() that takes an argument of type const char*. There are two ways to convert a String to a char pointer.

The first is to call toCharArray() which copies all the characters to an array:

// #3
char tmp[32];
s.toCharArray(tmp, sizeof(tmp));
send(tmp);

The second way is to call c_str() which returns a pointer to the internal buffer of the String:

// #4
send(s.c_str());

By calling c_str(), we avoid a costly duplication, but we must be careful with the returned pointer because it remains valid only as long as the String is unchanged. Also, be aware that c_str() can return nullptr.²
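Since a null pointer crashes functions like strlen() and strcmp(), one possible precaution is to replace it with an empty string before using it. The sketch below assumes two hypothetical helpers, safeCStr() and safeLength(); they are not part of the Arduino API.

```cpp
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: replaces a null pointer with an empty string.
inline const char* safeCStr(const char* p) {
  return p ? p : "";
}

// Hypothetical helper: a strlen() that tolerates null pointers.
inline size_t safeLength(const char* p) {
  return strlen(safeCStr(p));
}

// Possible usage on Arduino, assuming `s` is a String and
// send() takes a const char*:
//   send(safeCStr(s.c_str()));
```

Note that this guard only addresses the null case; the pointer still becomes invalid as soon as the String is modified or destroyed.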

Tip 3: Pass by reference

Compare these two function declarations:

// #5
void passByValue(String s);

// #6
void passByRef(const String& s);

We can call both functions with a String instance, but the first one receives a copy, whereas the second receives a reference to the original.

Passing objects by reference is much more efficient than passing by value; there are very few exceptions to this rule.³
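To make the difference visible, here is a small sketch that counts how many times the buffer gets duplicated. The class Text is a hypothetical, simplified stand-in for String (it assumes that malloc() succeeds), and lengthByValue() and lengthByRef() are illustrative names.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for String: owns a heap buffer and counts copies.
class Text {
 public:
  static int copies;  // number of times the copy constructor ran

  explicit Text(const char* s) : buf_(duplicate(s)) {}
  Text(const Text& other) : buf_(duplicate(other.buf_)) { copies++; }
  Text& operator=(const Text&) = delete;  // not needed for this sketch
  ~Text() { free(buf_); }

  size_t length() const { return strlen(buf_); }

 private:
  // Allocates a new block and copies the characters (assumes malloc() succeeds).
  static char* duplicate(const char* s) {
    char* p = static_cast<char*>(malloc(strlen(s) + 1));
    strcpy(p, s);
    return p;
  }

  char* buf_;
};

int Text::copies = 0;

// Receives a copy: one extra allocation and one extra strcpy().
size_t lengthByValue(Text t) { return t.length(); }

// Receives a reference: no allocation, no copy.
size_t lengthByRef(const Text& t) { return t.length(); }
```

Calling lengthByRef() leaves Text::copies untouched, whereas each call to lengthByValue() with an existing Text increments it by one.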

Tip 4: If you must pass by value, move the `String`

If you’re stuck with a library that forces you to pass a String by value, you can avoid the duplication by using the “move semantics” of C++11.

Suppose you constructed a String s; the normal way to pass s to a function is to write:

// #5
passByValue(s);

However, if you know you won’t use s after the call, you can “move” the object:

// #7
passByValue(move(s));

move() replicates the standard std::move() that Arduino lacks. This function converts s to a String&&, which allows the compiler to call the “move-constructor” of String. This constructor rips off the content of s to create the argument for passByValue(), so we cannot use s after that.
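For readers curious about how such a replacement can be written, here is a minimal sketch. It is only one possible implementation, placed in a hypothetical sketch namespace; the version in the sample project may differ. The class Text is again a simplified stand-in for String (it assumes that malloc() succeeds), and consume() plays the role of passByValue().

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

namespace sketch {

// Minimal replacement for std::remove_reference.
template <typename T> struct remove_reference      { typedef T type; };
template <typename T> struct remove_reference<T&>  { typedef T type; };
template <typename T> struct remove_reference<T&&> { typedef T type; };

// Minimal replacement for std::move(): casts its argument to an rvalue reference.
template <typename T>
typename remove_reference<T>::type&& move(T&& t) {
  return static_cast<typename remove_reference<T>::type&&>(t);
}

}  // namespace sketch

// Hypothetical stand-in for String with a move constructor.
class Text {
 public:
  explicit Text(const char* s)
      : buf_(static_cast<char*>(malloc(strlen(s) + 1))) {
    strcpy(buf_, s);
  }
  // Move constructor: takes over the buffer and leaves the source empty.
  Text(Text&& other) : buf_(other.buf_) { other.buf_ = nullptr; }
  // Copying is disabled to prove that no duplication happens.
  Text(const Text&) = delete;
  ~Text() { free(buf_); }

  const char* c_str() const { return buf_; }

 private:
  char* buf_;
};

// Takes its argument by value, like passByValue() above.
size_t consume(Text t) { return strlen(t.c_str()); }
```

With these definitions, consume(sketch::move(s)) compiles even though Text cannot be copied, and s.c_str() returns nullptr afterward, which is why the original object must not be used after the call.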

Tip 5: Mutate a `String` instead of creating temporaries

The + operator offers a nice syntax for composing Strings. Unfortunately, each call to this operator creates one or more temporary (read “hidden”) instances of String.

For example, the following line requires six allocations in the heap:

// #8
String path = String("/api/") + RESOURCE + "?key=" + API_KEY;

To reduce the number of allocations, we need to avoid the + operator. Instead, we can use the += operator, which modifies the left-side operand instead of creating a new String.

By rewriting the line above, we can reduce the number of allocations to four:

// #9
String path("/api/");
path += RESOURCE;
path += "?key=";
path += API_KEY;

Tip 6: Call `reserve()`

We can further reduce the number of allocations by calling reserve(), which allocates a buffer of the specified size. If we reserve enough room, the String doesn’t need to reallocate the buffer when we call +=.

The following version makes only two allocations:

// #10
String path;
path.reserve(63);
path += "/api/";
path += RESOURCE;
path += "?key=";
path += API_KEY;

Reducing the number of allocations is not the only benefit of reserve(); it also allows us to control the size of the allocation (64 bytes in the snippet above: 63 characters plus the terminator). That’s excellent because using blocks of constant size reduces the heap fragmentation, as we saw in the previous article.
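The effect of reserve() can be reproduced on a desktop with a toy string class that counts its allocations. MiniString, allocations, and buildPath() are hypothetical names, and the class is deliberately simplistic (it grows its buffer to the exact size needed and allocates nothing in its constructor), so the counts it produces are not the ones measured with the real String class.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static int allocations = 0;  // number of successful calls to realloc()

// Hypothetical, simplified string class used only for counting allocations.
class MiniString {
 public:
  MiniString() : buf_(nullptr), cap_(0), len_(0) {}
  MiniString(const MiniString&) = delete;
  MiniString& operator=(const MiniString&) = delete;
  ~MiniString() { free(buf_); }

  // Ensures there is room for `size` characters plus the terminator.
  bool reserve(size_t size) {
    if (buf_ && cap_ >= size) return true;  // already large enough
    char* p = static_cast<char*>(realloc(buf_, size + 1));
    if (!p) return false;
    allocations++;
    if (!buf_) p[0] = 0;  // a fresh buffer starts as an empty string
    buf_ = p;
    cap_ = size;
    return true;
  }

  MiniString& operator+=(const char* s) {
    size_t n = strlen(s);
    if (reserve(len_ + n)) {
      memcpy(buf_ + len_, s, n + 1);  // also copies the terminator
      len_ += n;
    }
    return *this;
  }

  const char* c_str() const { return buf_; }

 private:
  char* buf_;
  size_t cap_;  // capacity, excluding the terminator
  size_t len_;  // current length, excluding the terminator
};

// Builds the same path as above and returns the number of allocations.
int buildPath(bool useReserve) {
  allocations = 0;
  MiniString path;
  if (useReserve) path.reserve(63);
  path += "/api/";
  path += "outdoor_temp";
  path += "?key=";
  path += "0123456789";
  return allocations;
}
```

With this toy class, buildPath(false) performs one allocation per += (four in total), whereas buildPath(true) performs a single allocation of 64 bytes.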

Tip 7: Pass a null string to the constructor

In the snippet above, we used the default constructor of String, which constructs the instance from an empty string.

In other words:

String s;

is identical to

String s("");

Unfortunately, this constructor allocates a block in the heap, even if we don’t use this block at all. It allocates only one byte, so it’s not a big deal, but we can avoid it by passing a null pointer to the constructor.

The following version requires only one allocation:

// #11
String path((char *)0);
path.reserve(63);
path += "/api/";
path += RESOURCE;
path += "?key=";
path += API_KEY;

NOTE 1: this tip is irrelevant in Arduino cores that implement SSO (Small String Optimization), such as ESP8266’s and ESP32’s.

NOTE 2: some Arduino cores (notably Teensy’s) already pass NULL as the initial value.

Tip 8: Compose the string at compile time

The C++ language allows concatenating string literals at compile time; they simply need to be adjacent, separated only by whitespace.

For example:

const char* s = "hello" "world";

is identical to:

const char* s = "helloworld";

To leverage this feature with our previous example, we must substitute RESOURCE and API_KEY with two string literals. The simplest way is to define two macros that the preprocessor expands before the compiler parses the code.

// #12
#define RESOURCE "outdoor_temp"
#define API_KEY "0123456789"
String path = "/api/" RESOURCE "?key=" API_KEY;

By doing more work at compile time, we reduce the work at runtime.

Of course, this trick only works when you compose a String with constants, but they don’t have to be string literals. Indeed, there are many tricks that the preprocessor can do, but that’s the topic for another article.
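As one example of such a trick, the stringification operator (#) can turn a numeric constant into a string literal, which can then take part in the compile-time concatenation. In this sketch, the macros STR, XSTR, HOST, and PORT and the variable URL are hypothetical names chosen for illustration.

```cpp
#include <assert.h>
#include <string.h>

// Two-step stringification: XSTR() expands its argument first,
// then STR() turns the result into a string literal.
#define STR(x) #x
#define XSTR(x) STR(x)

// Hypothetical constants: one string literal and one integer.
#define HOST "example.com"
#define PORT 8080

// The whole literal is assembled before the program runs;
// no concatenation happens at runtime.
static const char URL[] = "http://" HOST ":" XSTR(PORT) "/api/";

// The size is known at compile time too (28 characters plus the terminator).
static_assert(sizeof(URL) == 29, "unexpected length");
```

The indirection through XSTR() matters: STR(PORT) alone would produce the string "PORT" instead of "8080".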

Conclusion

More than ever, I invite you to check out the code samples on GitHub because you’ll see the calls to malloc(), realloc(), free() and strcpy() for each of the examples above. I included the output of the program in the README file, so you don’t have to run the program yourself.

The implementation of String is part of the Arduino core, so theoretically the tips shown in this article could be irrelevant for some boards. However, because all the cores were forked from the original one for AVR, they are very likely to have the same String implementation.

I’ll see you soon with another article; meanwhile, check out that repository!

¹ Depending on the platform, the .data segment or the .rodata segment contains the string literals.
² This behavior differs from std::string, whose c_str() never returns nullptr.
³ See Item 41 of Effective Modern C++ by Scott Meyers.