Engineering Dropbox Transfer: Making simple even simpler

One of the challenges of application engineering within an established company like Dropbox is to break out of the cycle of incremental improvements and look at a problem fresh. Our colleagues who do user research help by regularly reminding us of the customer’s perspective, but so can our friends and family when they complain that product experiences aren’t as simple as they could be.

One such complaint, in fact, led to a new product now available to all our users called Dropbox Transfer. Transfer lets Dropbox users quickly send large files, and confirm receipt, even if the recipient isn’t a Dropbox user. 

You could already do most of this with a Dropbox shared link, but what you couldn’t do before Transfer turned out to be significant for many of our users. For instance, with a shared link, the content needs to be inside your Dropbox folder, which affects your storage quota. If you are just one large video file away from being over quota, sending that file presents a challenge. And one of the benefits of a shared link is that it’s always connected to the current version of the file. This feature, however, can become a hassle in cases where you want to send a read-only snapshot of a file instead of a live-updating link.

The more we dug into it, the more we realized that file sharing and file sending have very different use cases and needs. For file transfers, it’s really helpful to get a notification when a recipient has downloaded their files. This led us to provide the sender with a dashboard of statistics about downloads and views, prompting them to follow up with their recipient if the files are not retrieved. And unlike sharing use cases where link persistence is the expected default, with sending cases many people prefer the option of ephemeral expiring links and password protection, increasing the security of confidential content and allowing a “send and forget” workflow.

Because of these differences we chose to build an entirely new product to solve these sending needs, rather than overcomplicating our existing sharing features. Listening to the voices of people around us (whether Dropbox users or not) helped us break away from preconceived notions based on what is easy and incremental to build on top of the Dropbox stack.

This post is the story of how Transfer was built, told from the point of view of its lead engineer, from prototyping and testing through to production.

Know your customer: the engineering edition

As software engineers, we’re used to optimizing. We leave our fingerprints all over a piece of software: things like performance, error states, and device platform strategy (which devices we support) are disproportionately decided by engineers. But what outcomes are we optimizing for? Do we focus on performance or flexibility? When aggressive deadlines hit, which features should be cut or modified? An engineer often makes these cost-versus-value judgements hundreds of times in a typical month.

To correctly answer these optimization questions, engineers must know our customers.

Research is all around us

Ideation

Product development, like machine learning, follows either a deductive or an inductive reasoning path. Machine learning has two major methods of training: supervised (loosely, deductive) and unsupervised (loosely, inductive). Supervised training starts with a known shape of inputs and outputs, for example fitting a curve to labeled data points. Unsupervised learning attempts to draw inferences from the data itself, for example clustering data points to understand what questions to ask.

In deductive product development, you build a hypothesis, “users want x to get y done,” and then validate with prototyping and user research. The inductive approach observes that “users are exhibiting x behavior,” and then asks, “what is the y we should build?” We built Transfer with the first approach and are refining it with the second. I will focus on the first in this post.

So how does one come up with a hypothesis to test? 

There are many ways to come up with these initial seeds of ideas: open-ended surveys, rapid prototyping and iteration, focus groups, and emulation and combination of existing tools or products. Less often mentioned within tech circles is the easiest method of all: observing and examining your surroundings. Fortunately, at Dropbox, problems that need solving are not hard to find. Research is all around us, because the audience is essentially anyone with digital content. If we listen, we can let them guide us and sense-check our path.

This is how Transfer got its start. My partner complained to me that she could never use Dropbox despite my evangelizing. It was simply too hard to use for simple sending. “Why are there so many clicks needed? Why does it need to be within a Dropbox folder to send? Why do I have to move all the files I uploaded into a folder?” When I heard we might be exploring this problem, I jumped at the opportunity. At the very least, I might be able to persuade my exacting sweetheart to use Dropbox!

As the product progressed, I gained more confidence. I wasn’t sure if my accountant had received the files I sent by email; a videographer friend wanted a quick way to send his raw footage to an editor for processing. What started as a personal quest to persuade my partner quickly became a much broader effort. It turns out she isn’t the only one who wants a new one-way sending paradigm within Dropbox. Not all tools are as general-purpose as Transfer, but listening closely to people’s needs and feedback can quickly give directional guidance. For me, personally, it amplified my confidence that Transfer could have a large impact.

This is one of the reasons, after five years, that I keep working here: Dropbox users are everywhere. My dad in his human biology research lab storing microscope images and files containing RNA; my neighbor storing contracts in Dropbox so they can read them on-the-go; some DJ in a club, choosing what track to queue up next using our audio-previews. Being a part of the fabric of everyday people’s lives is an incredible privilege.

TL;DR: If you’re not sure if something makes sense, just ask a friend or two who might be in the target audience as a sense-check.

Path to validation

After these initial few sparks, drawn from my own experiences and from the experiences and research of others on the team, we were ready to test the idea. We tried to define it clearly and strongly enough that it would be proven either right or completely wrong. We did not want inconclusive results, as those could cost us months or years. We set out to prove or disprove that “Dropbox users need a quicker way to send files, with set-and-forget security features built in, like file expiration.”

We started with an email and a sample landing page test: would people even be interested in this description? It turned out they were. Then, curious about actual user behavior, we graduated to a prototype with all the MVP features. In parallel, we ran a set of surveys, including one based around willingness to pay, to make sure there was a market out there. Later on, we started monitoring a measure of product-market fit as we released new features (more on this later).

As an engineer, it’s important to always understand this hypothesis and feel empowered to push back and suggest cutting scope if a feature doesn’t bubble up to the core hypothesis. This helps product and design hone their mission, users have a cleaner experience, and engineers reduce support cost for features that only 0.1% of users will ever use. A common trap of the engineering “can-do” attitude is enabling feature creep, and eventually a hard-to-manage codebase and a cluttered product. As with product and design, code should seek to be on-message, with the strongest and most central parts corresponding to the most utilized and critical components to an experience.

Prototyping and building are the same optimization problem

When an engineer starts optimizing for the customer and their use case, the underlying technology and approach become bound to the spectrum of the customer’s needs.

Code quality as a spectrum

Every good engineering decision is made up of a number of inputs. These are things like:

  • complexity to build
  • resource efficiency
  • latency
  • compatibility with existing team expertise
  • maintainability

Given these traditional criteria, engineers might often fall into the trap of over-optimizing and unwittingly targeting problems the customer doesn’t care about.

Making sure to always add customer experience to these engineering inputs will help align efforts to deploy the highest code quality on the highest customer-leverage areas. If you consider a user’s experience to be an algorithm, this really is just a riff on the classic performance wisdom that comes out of Amdahl’s law: focus on optimizing the places where customers are spending (or want to spend) the most valuable time.
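To make the Amdahl's law analogy concrete, here is a minimal sketch. The formula is the standard one; the session-time shares are hypothetical numbers I made up for illustration, not measurements from Transfer.

```typescript
// Amdahl's law: overall speedup when a fraction `p` of total time
// is made `s` times faster. The remaining (1 - p) is unchanged.
function amdahlSpeedup(p: number, s: number): number {
  if (p < 0 || p > 1 || s <= 0) {
    throw new RangeError("expected 0 <= p <= 1 and s > 0");
  }
  return 1 / ((1 - p) + p / s);
}

// Hypothetical shares of a sender's time spent in each step.
const uploadShare = 0.8; // waiting on the upload
const settingsShare = 0.05; // tweaking link settings

// Doubling upload speed vs. making the settings step 10x faster.
const fasterUpload = amdahlSpeedup(uploadShare, 2);
const fasterSettings = amdahlSpeedup(settingsShare, 10);

console.log(fasterUpload.toFixed(2), fasterSettings.toFixed(2)); // prints "1.67 1.05"
```

Under these assumed numbers, a modest improvement where users spend most of their time beats a dramatic improvement where they spend almost none.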

Remember: Hacky technical solutions can be correct engineering solutions. Optimizing the quality of unimportant parts will only lead to unimportant optimizations.

Please note: I’m not advocating for writing a lot of messy fire-prone code, just for staying aware of the big picture at all times.

Part I: The throwaway prototype

We built a product we planned to delete in 2 months.

Why not just build the actual thing?

When exploring new product spaces, it is unclear where the user need (and/or revenue) lies. It’s usually helpful to decouple product learning from sustainability. When building a completely new surface, the optimal solutions for these two goals are rarely the same.

  • Learning: Optimize for flexibility. Do whatever it takes to show something valuable to a small set of users. This type of code might not even be code, but rather clickable prototypes built in tools like Figma.
  • Sustainability: Optimize for longer-term growth. This type of code might include things like clearly-delineated, less-optimized “crumple zones” that can be improved as the product scales and needs to be more efficient. It should also include aspirational APIs compatible with extensions such as batching or pagination.
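As an illustration of an "aspirational API" sitting on top of a crumple zone, here is a small sketch. Every name in it (TransferSummary, listTransfers, the cursor format) is hypothetical and not taken from the Transfer codebase. The point is only that the signature already speaks pagination while the body stays naive.

```typescript
// Hypothetical types for illustration only.
interface TransferSummary {
  id: string;
  title: string;
}

interface ListRequest {
  limit: number;
  cursor?: string; // opaque to callers; absent means "start from the top"
}

interface ListResponse {
  items: TransferSummary[];
  nextCursor?: string; // absent means there are no more pages
}

// Crumple zone: a naive in-memory scan. Because the signature is already
// paginated, a smarter store can replace the body without changing callers.
function listTransfers(all: TransferSummary[], req: ListRequest): ListResponse {
  const start = req.cursor === undefined ? 0 : Number.parseInt(req.cursor, 10);
  if (!Number.isInteger(start) || start < 0 || req.limit <= 0) {
    throw new RangeError("invalid cursor or limit");
  }
  const items = all.slice(start, start + req.limit);
  const end = start + items.length;
  return end < all.length ? { items, nextCursor: String(end) } : { items };
}

// Usage: walk every page until no cursor comes back.
const sample: TransferSummary[] = ["a", "b", "c"].map((id) => ({ id, title: `Transfer ${id}` }));
const collected: string[] = [];
let page = listTransfers(sample, { limit: 2 });
collected.push(...page.items.map((t) => t.id));
while (page.nextCursor !== undefined) {
  page = listTransfers(sample, { limit: 2, cursor: page.nextCursor });
  collected.push(...page.items.map((t) => t.id));
}
```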

How we did it

Smoke and mirrors. We took an existing product, forked part of the frontend, and applied a bunch of new CSS to turn a product based around galleries into a “new” one based around a list of files. Only a few backend changes were needed.

Mindful of its eventual removal, we surrounded all the prototype code with comment blocks like:

/* START: EXPERIMENT(TRANSFER) */
<code>
/* END: EXPERIMENT(TRANSFER) */

That way, we could quickly clean up after we were done.
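Markers like these also make the cleanup scriptable. The sketch below is not the tooling we used; it is a hypothetical helper (stripExperiment is an invented name) that assumes each marker sits on its own line and that marked regions are never nested.

```typescript
// Removes every region fenced by the START/END experiment markers,
// including the marker lines themselves. Throws on unbalanced markers
// so a partial cleanup never goes unnoticed.
function stripExperiment(source: string, experiment: string): string {
  const startMarker = `/* START: EXPERIMENT(${experiment}) */`;
  const endMarker = `/* END: EXPERIMENT(${experiment}) */`;
  const kept: string[] = [];
  let inside = false;
  for (const line of source.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === startMarker) {
      if (inside) throw new Error("nested START marker");
      inside = true;
    } else if (trimmed === endMarker) {
      if (!inside) throw new Error("END marker without START");
      inside = false;
    } else if (!inside) {
      kept.push(line);
    }
  }
  if (inside) throw new Error("START marker without END");
  return kept.join("\n");
}

// Made-up input: two permanent lines around one experimental line.
const before = [
  "renderGallery();",
  "/* START: EXPERIMENT(TRANSFER) */",
  "renderFileList();",
  "/* END: EXPERIMENT(TRANSFER) */",
  "renderFooter();",
].join("\n");

const cleaned = stripExperiment(before, "TRANSFER");
console.log(cleaned);
```

Failing loudly on an unmatched marker is a deliberate choice here: a silent partial deletion in a shared codebase is worse than a cleanup script that refuses to run.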

Results? After a month with it, people were sad to see it go, a sentiment we quantified with the Sean Ellis score. Sad enough to see it go that we had to take this to part II.
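For readers unfamiliar with the Sean Ellis score: it comes from a survey asking how users would feel if they could no longer use the product, and it is the share who answer "very disappointed." Around 40% is the commonly cited rule of thumb for product-market fit. A minimal sketch of the arithmetic follows; the answer labels and the responses are made up, not our survey data.

```typescript
type Answer = "very_disappointed" | "somewhat_disappointed" | "not_disappointed";

// Fraction of respondents who would be "very disappointed" to lose the product.
function seanEllisScore(answers: Answer[]): number {
  if (answers.length === 0) {
    throw new Error("no survey responses");
  }
  const very = answers.filter((a) => a === "very_disappointed").length;
  return very / answers.length;
}

// Commonly cited rule of thumb, not a Dropbox-specific threshold.
const BENCHMARK = 0.4;

// Made-up responses for illustration only.
const responses: Answer[] = [
  "very_disappointed",
  "very_disappointed",
  "very_disappointed",
  "somewhat_disappointed",
  "not_disappointed",
];

const score = seanEllisScore(responses);
console.log(score >= BENCHMARK ? "signal of fit" : "keep iterating");
```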

Part II: The enduring product

When it came time to tear down the temporary product—a prototype of hacks built on more hacks—our team needed to decide how we’d build the real thing.

Fortunately, Transfer is built on the concept of files, and files are something that Dropbox does well regardless of what they’re being used for. Our efficient and high-quality storage systems, abuse prevention, and previews pipelines, optimized over many years, fit directly into our product. Unfortunately, moving up the stack, the sharing and syncing models could not be reused. While the underlying technology behind sharing and the sync engine has been rebuilt and optimized over the years (with the most recent leap in innovation being our Nucleus rewrite), the product behavior and sharing model had remained largely unchanged for the last 10 years.

The majority of the sharing stack assumed that files uploaded to Dropbox would always be accessible inside of Dropbox and take up quota. It also assumed that links would refer to live-updating content, rather than a snapshot of the file at link creation time.

For the file sync infrastructure, there was an assumption of asynchronous upload: the idea that content would eventually be added to the server. For a product that was meant to immediately upload and share content while the user waits, the queuing and eventual consistency concept that had worked for file syncing would be disastrous for our user experience.

While sync and share seemed homologous to sending, their underlying technologies had many years of product behavior assumptions baked in. It would take much longer than the seven months of development time we had to relax these, so we chose to rebuild large pieces of these areas of the stack rather than adapt the existing code (while leveraging our existing storage infrastructure as-is). The response to our prototype had given us the conviction to take this harder technical path in order to provide a specific product experience, rather than change the product experience to match the shortest and most immediately-compatible technical approach.

It’s important to note that each decision to “do it ourselves” was made in conversation with the platform teams. We simply needed things that were too far-out and not validated enough to be on their near-term roadmap. Now that Transfer has proven to be successful, I’m already seeing the amount of infrastructure code our product team owns shrinking, as our platform partners add flexibility into their systems to adapt to our use cases. By building our own solution in lieu of taking a hard dependency on our platform partners, we were able to reduce temporary inter-team complexity and accelerate our own roadmap. Our habit of actively reducing cross-team dependencies also proved essential in hitting our goals.

When working in established codebases, here are some tips to keep things moving fast:

Be creative

Similar products are closer than you think. In our case, we found that sending photo albums had many similarities with what we were trying to do. This cut months off our development time, as we were able to leverage some ancient, but serviceable and battle-tested, code to back our initial sharing model.

Always ask about scale

At large companies, processes are often developed to work at the scale of the largest product. When working on a new project with a user base initially many orders of magnitude smaller than a core product’s, always start a meeting with another team by telling them your expected user base size in the near future. Teams might assume you’re operating at “main product scale” and base their guidance on that. It’s your job to make this clear. This can save your team from never launching a product because they’re too busy solving for the 200M user case before solving for the 100 user one.

Learn about users other than yourself

One thing we did early on was to build a set of internationalized string components. This took us extra time initially, but, armed with the knowledge that roughly half of all Dropbox users speak a language other than English, we knew the user impact would be well worth our time. One of our prouder engineering moments was when we were about to launch to the initial alpha population and got to tell the PM, who had assumed we hadn’t translated the product, that we should include a global population in the pre-release group. She was ecstatic the engineers had self-organized and decided this needed to be done.

Know what can change and what can’t

Sometimes things just can’t be done in a timely fashion. If they don’t contribute to the core identity of the product, consider letting them go. For us, this was the decision to initially not support emailing Transfers. Sending by copying a link was good enough for beta.

Always know where you are and where you want to be

When reviewing the specs for wide-reaching changes, or reading the new code itself, it’s useful to ask two questions:

  1. Where is this going? What is the ideal state of this?
  2. Where along this path is this? What number of stepping-stones should I expect to be looking at?

We would constantly step back to decide what was and wasn’t important to building the core identity into the product. We’d also constantly assess how easy it would be to change our minds if we had to (Type 1 vs. Type 2 decisions, in Jeff Bezos’ lingo: irreversible one-way doors vs. reversible two-way doors).

Some of the hardest calls were around our core, critical-path code: code that would process hundreds of thousands of gigabytes per month. For us, this meant the (initial) file uploader and our underlying data storage model, both inherited code that was neither the cleanest nor the best unit-tested. Due to time and resource constraints, we had to settle for simply “battle tested” and “present” over other factors.

The file uploader we chose for the web version of Transfer was the same uploader used in a number of other areas of the website, primarily file browse and file requests. This uploader was based on a 2012 version of a third-party library called PLupload, a very full-featured library providing functionality and polyfills that would go unused by our product. Since this uploader worked, it was hard to justify a rewrite during the initial product construction. However, as this library (at least the 2012 version of it) was heavily event-driven and mutated the DOM, it immediately started causing reliability issues when wrapped inside a React component. Strange things started happening: during our bug bashes, items would randomly get stuck while uploading. Long-running uploads would error out when DOM nodes suddenly disappeared, causing a cascade of node deletions as React tried to push the “hard-reset” button on the corrupted DOM tree. We chose to keep it, but abstracted away as much as possible.

We took a similar approach to the Nucleus migration: we started out by building an interface exposing every feature of PLupload we wanted to use. This interface consisted of our own datatypes rather than PLupload’s. This served two roles:

  1. Testing got much better as we had a boundary. We were able to both dependency inject a mock library to test the product code, and also connect the inner code to a test harness with clear expectations around I/O of each method.
  2. The added benefit of this boundary was that it would eventually act as a shim when we had time to swap out the library with something simpler. This also forced us to come up with our requirements for a rewrite ahead of time, greatly increasing the productivity of the rewrite.
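The shape of that boundary can be sketched roughly as follows. This is not our actual interface and it does not use the PLupload API; every name here (Uploader, UploadItem, MockUploader, sendAll) is hypothetical. It only shows the two properties described above: product code sees our own datatypes, and a mock can be injected in place of the library.

```typescript
// Our own datatypes, not the wrapped library's.
interface UploadItem {
  id: string;
  fileName: string;
  sizeBytes: number;
}

type UploadStatus =
  | { kind: "done"; id: string }
  | { kind: "failed"; id: string; reason: string };

// The only surface product code is allowed to see. A real implementation
// would translate these calls and types to and from the wrapped library.
interface Uploader {
  upload(item: UploadItem): Promise<UploadStatus>;
}

// Test double injected in place of the real library.
class MockUploader implements Uploader {
  readonly seen: string[] = [];
  private readonly failIds: Set<string>;

  constructor(failIds: string[] = []) {
    this.failIds = new Set(failIds);
  }

  async upload(item: UploadItem): Promise<UploadStatus> {
    this.seen.push(item.id);
    return this.failIds.has(item.id)
      ? { kind: "failed", id: item.id, reason: "simulated failure" }
      : { kind: "done", id: item.id };
  }
}

// Product code depends on the interface, never on the library directly.
// Returns the ids of the items that failed to upload.
async function sendAll(uploader: Uploader, items: UploadItem[]): Promise<string[]> {
  const failed: string[] = [];
  for (const item of items) {
    const status = await uploader.upload(item);
    if (status.kind === "failed") failed.push(status.id);
  }
  return failed;
}
```

Swapping the library later then means writing one new class that implements Uploader, while sendAll and its tests stay untouched.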

The underlying sharing code we chose was based not around the well-maintained Dropbox file and folder sharing links, but rather a much older link type created initially for sharing photo albums. These album links allowed us to quickly stand up functional Transfer links. The benefit was that we were adapting a known system that other teams already understood. The Customer Experience team was able to reuse playbooks and guides surrounding these links, the Security and Abuse teams already had threat models and monitoring on these links, and the team owning the sharing infrastructure already had context. By not having to build new infrastructure, we were able to reduce variables, allowing us to focus more on product development than foundational changes.

To allow us to migrate to a new system later, as with the web uploader, we wrapped this ancient set of helpers in our own facades.

As our scaling up and launch played out, it became clear we had made the correct architecture calls: These parts had held and performed. Good engineering can be as much about action as it is about restraint. When we did revisit these later, we had the space to take a more thoughtful and holistic approach than we would have months earlier.

Note: In early 2020 we migrated entirely off storing our data in the photo-album system, giving us reliability, maintainability, and performance improvements.

A strong culture of discourse

Each crossroad can be critical. Having a culture of inclusion where each voice is considered based on an argument’s merit is an essential component of making key technical calls correctly. When core questions like the ones above come up, answering them incorrectly can lead to long detours, future tech debt, or even product collapse. Sometimes maintaining a badly-built product is more costly than the value it creates.

In the specific case of reusing the photo album code, one of my teammates vehemently opposed the idea of taking on the tech debt from this system. The ensuing discussion over the course of a few weeks resulted in a number of proposals, documents, and meetings that uncovered and evaluated the time commitments required for different alternatives. Although we chose to take the photo album code with us to GA, the points raised through these meetings galvanized the short-term approach, backfilling the thoughtfulness that was either unspoken or lacking in its initial proposal, and brought the team together on a unified short-and-long-term vision for our sharing model’s backing code. These meetings helped set the eventual roadmap for complete migration off of the system.

Without a well-knit team motivated to speak their mind at each intersection, the quality of these decisions can grow weak.

I was lucky enough to work with 10 amazing engineers in a culture of open discourse on phase II of this project. I remember many afternoons spent working with the team to find our best path forward, balancing concerns from one engineer’s lens and another’s. Throughout, the glue that kept us moving forward through these discussions was the user need. Before each meeting, we’d try to articulate the “so what” of it all, and at the end try to bring it back again to the user. Whether this was a due-date we needed to hit for a feature or a level of quality expected, we could all align around the user as our ultimate priority. When we disagree, we ask “What would the user want?” and use that as our compass.

Keeping that customer voice growing in each of us, through things like subscribing to the feedback mailing list or participating in user research interviews, has proved crucial to our success as an engineering team. It is not enough to just have product managers, designers, and researchers thinking about the customer; engineers must as well.

Epilogue

As Dropbox Transfer adoption grows and the product expands and evolves, it remains important to reflect on its roots. The Transfer team remains committed not just to the “what” of the product but to the “why.” As we learn even more about our users from their feedback, we realize the process has just begun. Now every time we hear someone at the office or in a coffee shop complain about a file they were unable to send, we prick up our ears, roll up our sleeves, and smile, knowing that our work is not done yet. Best of all, my partner currently uses Transfer, and finally admits that I do useful things during the day.

Source: Dropbox