Backblaze Inc.

07/18/2024 | Press release | Distributed by Public on 07/18/2024 10:23

Why We Use Native Code in Backblaze Computer Backup

There's a lot that goes into building a user-friendly, robust backup utility. When Backblaze set out to create one back in 2007, our goal was to make sure that users of all skill levels would have automatic, nearly continuous backups that could be restored on command. There were plenty of design decisions to be made, and one of the biggest was whether to implement our client in native code.

You might have seen us talk about this on our website and elsewhere, and we felt it was high time to dive into what that decision meant for our development, how it affected the way the Backblaze client works, and why we think it was an important decision and inflection point for Backblaze Computer Backup and our customers.

What is native code?

Each kind of computer central processing unit (CPU), such as Intel/AMD or Apple Silicon, has its own "machine language," which is the set of instructions the CPU can understand and follow. These instructions are encoded in binary, and aren't something people can read or write without great effort. When folks talk about using native code, they're typically talking about a computer program that's delivered in machine language (usually translated from human-readable source code ahead of time), so a computer's CPU can "natively" understand what the program needs the CPU to do.

Compiled languages

To use a compiled language, developers write instructions into source code that's easy for humans to read and edit. Then, they use a program aptly called a compiler to convert the source code into machine language for a particular kind of CPU. Examples of compiled languages are C, C++, Rust, Go, Swift, and Haskell. Assembly (ASM) belongs in the same family, though it's converted by a simpler tool called an assembler.
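As a minimal sketch of what that looks like in practice, here is a tiny C++ function. The name `sum_to` is made up for illustration and has nothing to do with the Backblaze client. A compiler such as g++ or clang++ would convert this source into machine instructions for one kind of CPU before the program ever runs.

```cpp
#include <cstdint>

// Hypothetical example function: adds the integers 1 through n.
// The compiler turns this loop into machine instructions for the target
// CPU ahead of time, so no translation work is left to do at run time.
std::uint64_t sum_to(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}
```

Compiling the same source for a different CPU produces different machine language, which is why a native program is built separately for each kind of processor it supports.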

Interpreted languages

As with compiled languages, developers write programs in interpreted languages by writing instructions into source code files. But instead of converting those instructions into machine language ahead of time, another program called an interpreter reads the source code while the program runs and follows the instructions it contains. Common interpreted languages include Python, Ruby, BASIC, and PHP.
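To make the idea of an interpreter concrete, here is a toy one written in C++. The little stack language it understands (numbers plus the words `add` and `mul`) and the function name `run_toy_program` are invented for this sketch; real interpreters are far more elaborate, but the principle is the same: the source text is read and acted on at run time.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Toy interpreter for a made-up stack language (illustrative only).
// It reads the program's source text at run time and carries out each
// instruction directly; the toy program itself is never converted into
// machine language.
long run_toy_program(const std::string& source) {
    std::vector<long> stack;
    std::istringstream in(source);
    std::string word;
    while (in >> word) {
        if (word == "add" || word == "mul") {
            if (stack.size() < 2) {
                throw std::runtime_error("not enough operands for " + word);
            }
            long rhs = stack.back();
            stack.pop_back();
            long lhs = stack.back();
            stack.pop_back();
            stack.push_back(word == "add" ? lhs + rhs : lhs * rhs);
        } else {
            // Anything that is not an instruction is treated as a number.
            stack.push_back(std::stol(word));
        }
    }
    if (stack.empty()) {
        throw std::runtime_error("program produced no result");
    }
    return stack.back();
}
```

For instance, `run_toy_program("2 3 add 4 mul")` works out (2 + 3) * 4 and returns 20. Notice that every run has to split the text into words and compare strings before any arithmetic happens, which is extra work a compiled program does not repeat each time it runs.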

There is a bit of a gray area between compiled and interpreted languages. For example, some modern Java implementations mix an interpreter and a compiler. Either way, when it comes to programming, the choice is about picking a language that's suitable to a task's requirements.

When and how do you use each type of language?

Well, pretty much anything anyone does on computers these days will take a combination of code languages. In some ways, the whole challenge of working with computers is bridging how humans communicate vs. how computers can process things.

To put the above in a metaphor, a compiled language is like someone who was raised speaking two languages natively and can fluently curse in both.

By contrast, an interpreted language is like this: You've moved to a country where you're not fluent in the language, but someone needs a thorough dressing-down. You compose the insult in your native language, and a translator (the interpreter) stands beside you, dictionary in hand, turning each phrase on the spot into an equally effective, similarly meaningful insult in the local language. If you didn't have your translator, your attempt at offense (in this metaphor, a program!) would likely fail because no one could understand you.

To wit: While they mean similar things, "when pigs fly," and "quand les poules auront des dents," do not literally translate.

What are the benefits of using native code in a backup application?

Using native code in a backup application is, in our opinion, better for several reasons.

Permissions

When you're writing in native code, you're plugging in your program at a lower level than most applications. That gives you access to the kinds of APIs the operating system (OS) itself uses. That level of integration means users have to update permissions less frequently, developers have more robust options for building the client, and the backup client can run seamlessly in the background.

Efficiency: Build once, run everywhere

By building our backup client lower in the chain of command, so to speak, we can reuse the same work in different situations. Some interpreter-based platforms have been built for this purpose, like Java and its Java Virtual Machine (JVM). Using those solutions, however, would sacrifice some of the other benefits we're outlining in this article.

Being fully in control of our common code, we can do this without an interpreted language and still have the other advantages listed here. So, we can use the same base code for both our Mac and Windows clients, and then add platform-specific code on top of each to refine the clients. There may be slight differences between the OS environments, but coding at the level of a compiled language like C++ means that we can adjust for those differences effectively.
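One common way to structure that kind of sharing in C++ is conditional compilation: a thin platform-specific layer sits underneath common code that is written once. The sketch below is only an illustration of the general technique, not Backblaze's actual code; the function names `platform_name`, `path_separator`, and `join_path` are hypothetical. The `_WIN32` and `__APPLE__` macros are predefined by compilers targeting Windows and Apple platforms, respectively.

```cpp
#include <string>

// Platform-specific layer: each OS gets its own small implementation.
// Function names and return values are hypothetical, for illustration.
#if defined(_WIN32)
std::string platform_name() { return "Windows"; }
char path_separator() { return '\\'; }
#elif defined(__APPLE__)
std::string platform_name() { return "macOS"; }
char path_separator() { return '/'; }
#else
std::string platform_name() { return "other"; }
char path_separator() { return '/'; }
#endif

// Common code: written once and compiled unchanged for every platform.
// It relies on the platform layer above for anything OS-specific.
std::string join_path(const std::string& dir, const std::string& file) {
    if (dir.empty()) {
        return file;
    }
    if (dir.back() == path_separator()) {
        return dir + file;
    }
    return dir + path_separator() + file;
}
```

The compiler keeps only the branch that matches the target platform, so each build contains native code for that OS while `join_path` stays identical everywhere.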

Performance

Running native code typically results in better performance. That's because there are fewer steps (for your computer) between understanding a program and running a program.

Backup programs run all the time in the background, and have to keep track of a lot of information. Backblaze's native code does that using half to a tenth of the computing resources that a backup program written in an interpreted language would use. So, Backblaze won't slow down or interrupt the other activities you're doing with your computer.

Reducing software bloat and size of software

Also, since you don't have to install interpreters (you know, your insult dictionary), native code applications are usually leaner and more performant on the system.

Eliminating risky third-party dependencies

Since they're software, computer language interpreters have bugs and get new features, so they're frequently updated. Sometimes an updated interpreter won't run programs written for an older version of the language, or will cause a program to behave differently in an unexpected (read: "buggy") way. Also, vendors have even changed licensing terms and started charging money for interpreters that had been free. Backblaze's native code doesn't have those problems.

Platform-standard user interface

Operating system vendors like Microsoft and Apple strongly encourage developers to write programs that use a platform-standard user interface "look-and-feel." Programs that do that help users feel comfortable, minimize surprises, and support accessibility features like text-to-speech.

The most effective way to ensure a program's user interface matches a platform's standard look-and-feel is to use features built into the operating system, and those are typically only available to native code like Backblaze's client.

What are the challenges of using native code in a backup application?

Nothing is perfect. What are the downsides to this approach?

Industry preference moving towards interpreted language/web apps

Has anyone else noticed that the world of development has changed recently? (No need to qualify that statement; it will be true tomorrow, tomorrow, and tomorrow again.)

As with any industry, tech's (and developers') favorite strategies for creating things and solving problems have changed over time.

There are various players in this space, including platforms (Mac, Windows, Linux), software (Adobe, Office), applications (Slack, the latest mobile game, your headphone utility client), and frankly, many things that skirt the boundaries of the above buckets. Executing any program, and particularly third-party applications, is a negotiation between operating systems' publishers and the program/application's developers.

Over time, those who sell computers and manage OSes have grown to prefer the lightweight development of application ecosystems. It lets them have more control over their platforms, and it gives developers a shorter time to deployment, as long as they play within the sandbox the OS has made available. OS publishers are attempting to anticipate the needs of program and app developers, but there are some types of utilities (backup is one of them) that justifiably break standard rules. Giving access to all your files by default, for example, isn't something you'd do for a social media application. However, in order to get a full and complete backup, a program does need that level of access.

Limited dev libraries

Given the preference of developers to move to web applications and interpreted languages (for good reason in some cases), many OS vendors are releasing less detailed support and/or technical documentation for some of their deeper-level tools. If you're implementing in native code in today's environment, you need both historical knowledge and ingenuity in house. Which leads us to our next point…

Expertise

We're on board with the evolution of development (innovation is at the heart of our company), but for aspects of our backup client, we need developers with a deep understanding of compiled languages and our supported ecosystems. And luckily, in any sufficiently large tech company, you'll find folks specializing in different languages and parts of the tech stack. That means we can spend more time nurturing and developing our internal talent rather than seeking it externally.

Hybrid approaches?

Hey, we've spent a whole article telling you why native code matters. But many folks agree that the future requires a hybrid approach, largely because of that gray area between compiled and interpreted languages we mentioned above. You can certainly see that in our style as well: our Mac client uses a combination of Objective-C, SwiftUI, and C++, for example.

The now and future Backblaze

The core functionality of our client depends on native code for very good design reasons, and they're ultimately all about making things easier for our end users.

Overall, our design ideas are all centered on what it means to use Backblaze every day, regardless of an end user's skill level. We want things to be simpler, and sometimes the questions we need to answer (how do I make sure the Backblaze client backs everything up?) are actually a tad more complicated upfront (the Backblaze client needs system permissions, and that means implementing it in native code), in that they require forethought and an investment of time and resources. But we also prioritize the kind of thinking we can use over and over. So, even if we spend a little more time building native code, it's an investment that has longevity. Put another way: Build once, run everywhere.
