The term embedded system is rather vague, so it is perhaps prudent to narrow our field a little. We are focusing on very small devices: development on small DSP and microcontroller based systems running primarily out of the CPU's internal memory. We shall discuss this scale of system mainly because use of C++ on a 32-bit ARM processor running Linux is hardly contentious. These systems are about as small as you get, and often individual bytes and clock cycles will count. Conventional wisdom generally dictates that we should avoid the use of C++ in such systems entirely, and stick to C and assembly. Personally I don't agree, but certainly C++ must be used with care, and we will not build such a program using the same style and idioms we would for a desktop application.
Let's start out with a few reasons not to use C++. Firstly and most importantly, you need to be sure there is good compiler support on your platform. Some processors will have no C++ compiler, in which case you are stuffed. But even on those that do have support you need to be careful, as a poor compiler is almost worse than no compiler at all.
Also worth consideration is that in the embedded world you will likely work with a lot of electronic engineers. Many of them will have a basic to fairly decent grasp of C but little to no understanding of C++. This can be problematic when handing off your code for maintenance, or if your team involves several sparkies who will be contributing to the software. Remember you should always assume that the person maintaining your code is prone to fits of rage, owns an axe and knows where you live. Make sure choosing C++ is not likely to become a maintenance nightmare for one of these axe murderers.
Well now that we have established that we have a fine upstanding compiler and a well educated team of crack programmers, we can proceed to why C++ is so awesome.
Templates, what's not to love about templates? They let us retain a high degree of flexibility in the final product with very little runtime cost. Additionally, they open up great additional code reuse options. Often I find myself reusing generic buffers and the like, even within the same project. Templates are probably my favourite thing about using C++ for embedded work. And finally, templates let us do a lot more at compile time than we can in C, and the more done at compile time the less we have to squeeze into our tiny system at runtime. Interestingly, there was a subset of C++ doing the rounds a while ago called Embedded C++ which completely omitted templates. Madness, I tell you.
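To make the generic buffer idea a bit more concrete, here is a minimal sketch of a fixed-capacity ring buffer. The names RingBuffer, UartRxBuffer and AdcSampleBuffer, and the sizes used, are made up for illustration and are not from any particular project. The element type and capacity are template parameters, so the storage is sized at compile time and the same code serves two unrelated jobs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity ring buffer. T and N are fixed at compile
// time, so the storage is a plain array and nothing touches the heap.
template <typename T, std::size_t N>
class RingBuffer {
    static_assert(N > 0, "capacity must be non-zero");
public:
    // Returns false (and drops the value) when the buffer is full.
    bool push(const T& value) {
        if (count_ == N) return false;
        data_[head_] = value;
        head_ = (head_ + 1) % N;
        ++count_;
        return true;
    }

    // Returns false when empty; otherwise copies the oldest item into out.
    bool pop(T& out) {
        if (count_ == 0) return false;
        out = data_[tail_];
        tail_ = (tail_ + 1) % N;
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    T data_[N] = {};
    std::size_t head_ = 0;   // next slot to write
    std::size_t tail_ = 0;   // next slot to read
    std::size_t count_ = 0;  // number of stored items
};

// The same template reused for two different jobs (sizes are assumptions).
using UartRxBuffer = RingBuffer<std::uint8_t, 64>;
using AdcSampleBuffer = RingBuffer<std::int16_t, 16>;

// Small self-check showing typical use.
inline void RingBufferDemo() {
    UartRxBuffer rx;
    assert(rx.push(0x41));
    std::uint8_t byte = 0;
    assert(rx.pop(byte) && byte == 0x41);
}
```

Note that each distinct combination of T and N produces its own copy of push and pop, which is worth keeping in mind on a part with very little flash.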
C++ provides better language features for arranging projects, especially as they get larger. I don't know whether it is the brainwashing of being educated through the staunchly object orientated part of history, but I do like to have objects at my disposal whilst coding. Whilst somewhat moving out of vogue, objects do map very well onto certain problems. C++ was written to include both procedural and object orientated styles, and I feel that is how it should be written, using objects when they enhance the elegance of the code.
I make rather heavy use of static classes in my embedded code. A class containing only static methods and data behaves much like a collection of C functions, hence very little unnecessary overhead. Static classes also map well onto a lot of common embedded tasks. For example, wrapping a hardware peripheral in an interface is a fairly common task, and naturally a single peripheral only needs a single object to represent it. You could achieve the same effect using plain C, but I like having the opportunity to override parts of my interface to provide alternative functionality.
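As a rough illustration of the static class idea, here is a sketch that wraps a single LED. Everything in it is hypothetical: the class name StatusLed, the pin number, and in particular fake_gpio_out, which is an ordinary variable standing in for a memory-mapped register so that the example can be compiled and run on a desktop machine.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a memory-mapped GPIO output register. On real hardware this
// would be a volatile register at a fixed, device-specific address.
static volatile std::uint32_t fake_gpio_out = 0;

// Hypothetical static class: only static members, never instantiated.
class StatusLed {
public:
    StatusLed() = delete;  // there is one LED, so no objects are needed

    static void on()    { fake_gpio_out = fake_gpio_out | kMask; }
    static void off()   { fake_gpio_out = fake_gpio_out & ~kMask; }
    static bool is_on() { return (fake_gpio_out & kMask) != 0; }

private:
    static constexpr std::uint32_t kMask = 1u << 5;  // assumed pin number
};

// Small self-check showing typical use.
inline void StatusLedDemo() {
    StatusLed::on();
    assert(StatusLed::is_on());
    StatusLed::off();
    assert(!StatusLed::is_on());
}
```

Calls such as StatusLed::on() carry no hidden this pointer and can be inlined like a plain C function, while the class still gives the peripheral a namespace and a place to hide details such as the pin mask.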
Avoid exceptions and RTTI. They both add a lot of additional code footprint and are rarely necessary. Additionally, coming back to those poor sparkies, it is worth noting that they have probably not heard of RAII, which exception-safe code leans on heavily.
Whilst I heaped praise on templates earlier, they do have a darker side; templates can spew out quite the mountain of code if not used with care. Even worse, this can be dependent on the compiler. You need to try to avoid generating additional code that is not required. For example, the following small code snippet will often generate two copies of the Bar function even though both are identical. Clearly this should be taken care of by the optimizer, but a good number of compilers will miss this one.
Code that doesn't need to be duplicated can be factored out into a non-templated base class, doing the optimizer's work for it. This allows us to keep the templated code, but only generate one copy of the Bar function.
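The original snippet is not reproduced here, so what follows is only a guess at the kind of code being described. Only the name Bar comes from the text; Foo, Get and the body of Bar are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical reconstruction of the problem case.
template <typename T>
class Foo {
public:
    explicit Foo(T value) : value_(value) {}

    // Genuinely depends on T, so one copy per instantiation is expected.
    T Get() const { return value_; }

    // Does not depend on T at all, yet Foo<int>::Bar and Foo<long>::Bar are
    // distinct functions, so the compiler may emit the same code twice.
    int Bar(int x) const { return x * 2 + 1; }

private:
    T value_;
};

// Using two instantiations is enough to ask for two copies of Bar.
inline int UseBoth() {
    Foo<int> a(3);
    Foo<long> b(4L);
    assert(a.Get() == 3);
    return a.Bar(1) + b.Bar(1);
}
```

Whether the duplicate actually survives into the final image depends on inlining and on whether the toolchain merges identical functions, which is exactly why it is worth checking the map file or disassembly rather than assuming.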
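A minimal sketch of that refactoring might look like the following. Apart from Bar, the names (FooBase, Foo, Get) and the function body are assumptions for illustration only.

```cpp
#include <cassert>
#include <type_traits>

// Non-templated base: anything that does not depend on T lives here.
class FooBase {
public:
    int Bar(int x) const { return x * 2 + 1; }  // exists exactly once
};

// The template keeps only the parts that really vary with T.
template <typename T>
class Foo : public FooBase {
public:
    explicit Foo(T value) : value_(value) {}
    T Get() const { return value_; }

private:
    T value_;
};

// Both instantiations now name the very same member function of FooBase.
static_assert(std::is_same<decltype(&Foo<int>::Bar),
                           decltype(&Foo<long>::Bar)>::value,
              "Bar is inherited from FooBase, not generated per type");

// Small self-check showing that callers are unaffected by the change.
inline int UseBoth() {
    Foo<int> a(3);
    Foo<long> b(4L);
    assert(a.Get() == 3);
    return a.Bar(1) + b.Bar(1);
}
```

In a real project Bar would more likely be defined out of line in a single source file, so that there is one obvious copy in the map file; it is kept inline here only so the sketch fits in one block.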
Try to push as much calculation to compile time as you can; templates are awesome for this. And keep a close eye on what your compiler is generating, as sometimes you might need to do a little bit of work on its behalf. This actually comes up fairly often for me and I might try to write a post on annoying optimization differences in the future.
Use polymorphism sparingly; virtual functions add both a small runtime and code footprint overhead. Whilst the penalty is small, make sure that it is worth it. Again, as with templates, keep a good eye on the compiled output to make sure nothing crazy is happening.
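As one small example of moving work to compile time, here is a sketch that computes a UART baud rate divisor before the program ever runs. The clock frequency, baud rate, 16x oversampling scheme and 16-bit register width are all assumptions picked for illustration, and BaudDivisor and UartConfig are hypothetical names. It needs a compiler with C++11 constexpr support.

```cpp
#include <cassert>
#include <cstdint>

// Rounded divisor for a UART that oversamples by 16 (a common scheme, but
// an assumption here; check the datasheet for a real part).
constexpr std::uint32_t BaudDivisor(std::uint32_t clock_hz, std::uint32_t baud) {
    return (clock_hz + 8u * baud) / (16u * baud);
}

// Hypothetical configuration template: the divisor is a compile-time
// constant, and a bad configuration fails the build instead of the board.
template <std::uint32_t ClockHz, std::uint32_t Baud>
struct UartConfig {
    static_assert(Baud > 0, "baud rate must be non-zero");
    static constexpr std::uint32_t divisor = BaudDivisor(ClockHz, Baud);
    static_assert(divisor >= 1 && divisor <= 0xFFFFu,
                  "divisor does not fit the assumed 16-bit register");
};

// Assumed board values: 16 MHz clock, 115200 baud.
using ConsoleUart = UartConfig<16000000u, 115200u>;
static_assert(ConsoleUart::divisor == 9, "checked entirely at compile time");

// Small self-check; the function reduces to returning a constant.
inline std::uint32_t ConsoleDivisor() {
    assert(ConsoleUart::divisor == 9u);
    return ConsoleUart::divisor;
}
```

No division ends up in the target code, which matters on small parts where a 32-bit divide can mean pulling in a library routine.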
One thing I have been experimenting with recently is what happens if you allocate all memory on the stack. This means not using any new or delete allocated memory. Certainly when pushing the limits of a small processor, memory can become hotly contested, which makes judging the sizes of the stack and heap rather important. Both the stack and heap need a good safety buffer on their size, so eliminating one does make some degree of sense. This also has the advantage that stack based allocations are far more deterministic in size, giving a much tighter bound on memory consumption. Furthermore, once new and delete are no longer used, that code doesn't need to be included in the executable, freeing up a good amount of code space as well.
Obviously, entirely stack based allocation is not appropriate for all systems, and please don't take this as me suggesting you should never allocate memory; if you need to use the heap, please do. However, a fair amount of embedded work has fairly fixed quantities involved: I have X sensors and Y SPI peripherals. Rarely is it like a desktop system where hardware might be connected and disconnected. It is worth considering whether you can get away with skipping the heap, as you will save a good amount of code and data space.
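A small sketch of what heap-free code can look like is given below. The Sensor type, the count of four and the readings are invented for illustration. Deleting the class-specific operator new is one optional way to have the compiler reject accidental heap allocation of a type; it is a suggestion here, not something the approach requires.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sensor record.
struct Sensor {
    std::uint8_t channel;
    std::int16_t last_reading;

    // With these deleted, writing `new Sensor` anywhere is a compile error.
    static void* operator new(std::size_t) = delete;
    static void* operator new[](std::size_t) = delete;
};

constexpr std::size_t kSensorCount = 4;  // assumed: fixed by the board design

// Takes the array by reference, so nothing is copied or allocated.
inline std::int32_t SumReadings(const std::array<Sensor, kSensorCount>& sensors) {
    std::int32_t sum = 0;
    for (const Sensor& s : sensors) sum += s.last_reading;
    return sum;
}

inline std::int32_t PollOnce() {
    // Lives in this function's stack frame; its size is known at compile
    // time, so worst-case stack use can be worked out in advance.
    std::array<Sensor, kSensorCount> sensors = {{
        {0, 10}, {1, 20}, {2, 30}, {3, 40}
    }};
    return SumReadings(sensors);
}
```

Because the capacity is a compile-time constant, the memory cost of the sensor table is simply sizeof(Sensor) times kSensorCount, with no allocator bookkeeping on top.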
Well that is probably a deep enough journey into my crazy world for one day. I think the most important lesson to learn is that C++ has a lot to contribute to embedded projects, but please keep an eye on what the compiler is generating and which resources are in highest contention.
EDIT: I have recently found this rather interesting link which discusses memory management in console games. I think a lot of it relates well to what I was getting at in my discussion of heap memory allocation. FYI, that is quite an interesting blog in general.