A Q&A with Instacart CTO, Mark Schaaf
It’s been a busy few months.
We’re seeing the highest customer demand in Instacart history and have more active shoppers on our platform today than ever before, picking and delivering groceries for consumers across North America. During this crisis, our Engineering, Product, Design, and Data Science teams went into overdrive to strengthen our infrastructure and add new features into the app to better reflect everyone’s new normal — bulk buying and low item availability.
COVID-19 has changed the way everyone gets their food, and it’s required us to redesign our products and systems to meet the 500% increase in year-over-year order volume we’re seeing. We’ve sat down with our CTO, Mark Schaaf, to take a look at some of the moves we’ve made behind the scenes to keep our site stable and maintain service during this time.
Let’s start at the foundation — how has this influx of demand affected Instacart’s Infrastructure?
Rapid scale has a way of revealing bottlenecks. Instacart has experienced three years of projected growth in 30 days, which can put a huge strain on your systems. We’ve essentially had to break and reset our technical infrastructure to reconcile our expected growth timeline with the order growth that was happening on the platform as shelter-in-place orders rolled out across North America.
Our number one mission throughout March was to scale out our infrastructure ahead of our trajectory of 20% day-over-day growth. This infrastructure is the foundation for our four-sided marketplace, including our customer-facing app, shopper app, enterprise software, and advertising engines. As traffic and orders increased, we saw dramatically different read and write usage patterns and had to change the configuration of our datastores, often making upgrades multiple times a week. I’m proud of how the team has been sprinting to implement three years of technical scale immediately.
How has the product changed?
Over the last two months, we’ve had to completely reset our product roadmap to focus on features that prioritize shopper and customer safety.
Instacart is a data-driven organization: we build and test often, and use as much data as we can gather to inform new features for our customers and shoppers. In March, the team had to walk a fine line between maintaining our data-driven decision-making culture and launching features we knew in our gut were the right move.
One of the most noticeable new features we launched is “Leave at My Door Delivery.” We began developing this feature in late 2019, and at the onset of the pandemic, we observed more and more customers in test groups opting into Leave at My Door Delivery. Knowing how critical this could be to customer and shopper safety, we fast-tracked the experiment window and pushed the feature nationwide in mid-March to meet demand.
We made a similar decision for a critical feature we were testing in the Shopper App. Early in 2020, we piloted a Mobile Checkout feature, which allows shoppers to pay at the register using Apple Pay or Google Pay. We were piloting this in a few select metro areas, gathering data to inform a larger rollout. In March, we recognized the immediate need for Mobile Checkout and shipped it nationwide. We also greenlit a more streamlined alcohol delivery feature that allows shoppers to scan a customer’s ID upon delivery, so customers no longer need to sign on a shopper’s phone, which helps everyone adhere to social distancing guidelines.
We rolled out Fast and Flexible ordering, which lets customers forgo normal pre-scheduled 1-2 hour delivery windows and opt for the first delivery window to open up in their area. To put it simply, it allows customers to add their orders to a “first available” delivery queue. This went from idea to test to national rollout in about two weeks.
What about item availability?
Where to start? At the beginning of March, we saw nearly a 30% drop in ordered items being found. The average customer basket size has also grown by more than 35% month-over-month. At the onset of the shelter-in-place orders, surges in demand, bulk buying, and quantity restrictions on certain items (like bathroom tissue and flour) created a gap between what was in-store and what was reflected in-app. We needed to make a lot of changes to get a better understanding of what items would be in stock on the day of delivery and to set appropriate expectations with customers as they filled their carts.
On the back end, the team quickly built out a tool allowing retailers to send us maximum item quantities. This allowed us to roll out retailer-specific item maximums in our customer and shopper apps that match in-store policies. We can now track the different maximums across more than 350 retail partners and over 25,000 stores every day.
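As a rough illustration of how retailer-specific item maximums might be applied when a customer edits a cart, here is a minimal Python sketch. It is not Instacart’s implementation: the `ItemLimits` mapping, the `clamp_quantity` function, and the retailer and item identifiers are all hypothetical names chosen for the example.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup: (retailer_id, item_id) -> maximum quantity per order.
ItemLimits = Dict[Tuple[str, str], int]


def clamp_quantity(
    limits: ItemLimits, retailer_id: str, item_id: str, requested: int
) -> int:
    """Cap the requested quantity at the retailer's limit, if one is set."""
    if requested < 0:
        raise ValueError("requested quantity must be non-negative")
    limit: Optional[int] = limits.get((retailer_id, item_id))
    # No entry means the retailer has not sent a maximum for this item.
    return requested if limit is None else min(requested, limit)


# Example data (invented for illustration).
limits: ItemLimits = {
    ("retailer_a", "bath_tissue_12ct"): 2,
    ("retailer_a", "flour_5lb"): 1,
}

allowed = clamp_quantity(limits, "retailer_a", "bath_tissue_12ct", 5)
unrestricted = clamp_quantity(limits, "retailer_b", "bath_tissue_12ct", 5)
```

Keying the limits by retailer and item keeps the same product free to carry different maximums at different retailers, which is the behavior described above.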
Our Machine Learning team mobilized quickly to make substantial changes to our Item Availability Model in short order to reflect the reality on grocery store shelves. We doubled the rate at which we were running our item availability model, running it every 60 minutes to better understand availability fluctuations throughout the day. Previously, our availability model looked back at a 30-day period to make solid availability predictions — now, we’ve narrowed that range to one week, and even three-day windows to better understand what products are flying off the shelves. One week it’s hand sanitizer, the next it’s flour.
We more than doubled the number of products scored by the model and narrowed the window of historical shops that we look at to reduce noise.
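To see why a shorter lookback window matters, consider a deliberately simple stand-in for the model: the fraction of recent picks in which an item was found. The real Item Availability Model is a machine learning model whose features are not described here, so the `PickEvent` record and `found_rate` function below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class PickEvent:
    """Hypothetical record of one shopper attempt to pick an item."""
    item_id: str
    picked_at: datetime
    found: bool


def found_rate(
    events: Iterable[PickEvent],
    item_id: str,
    now: datetime,
    lookback: timedelta,
) -> Optional[float]:
    """Share of picks within the lookback window where the item was found."""
    cutoff = now - lookback
    recent = [
        e for e in events
        if e.item_id == item_id and cutoff <= e.picked_at <= now
    ]
    if not recent:
        return None  # no evidence inside the window
    return sum(e.found for e in recent) / len(recent)


# Invented history: yeast was easy to find weeks ago, then sold out.
now = datetime(2020, 4, 10, 12, 0)
events = [
    PickEvent("yeast", now - timedelta(days=20), True),
    PickEvent("yeast", now - timedelta(days=15), True),
    PickEvent("yeast", now - timedelta(days=2), False),
    PickEvent("yeast", now - timedelta(days=1), False),
]

rate_30d = found_rate(events, "yeast", now, timedelta(days=30))
rate_3d = found_rate(events, "yeast", now, timedelta(days=3))
```

With this made-up history, the 30-day window still reports a found rate of one half, while the three-day window reports zero, which is closer to what a shopper would see on the shelf that day.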
At the Product level, this fine-tuned understanding of what’s in stock informed new features to help set expectations for customers and shoppers. We can now automatically filter out-of-stock items from search results. If you were looking for baking yeast in early April, for example, and couldn’t find it, this is why it didn’t pop up in your search results.
We also added visible “out of stock” badges to item listings that were likely to be out of stock.
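One way to picture the two behaviors together, hiding items that are almost certainly gone and badging items that are merely at risk, is a pair of thresholds on an availability score. The sketch below is hypothetical: the `Listing` type, the `render_results` function, the score scale, and both threshold values are assumptions, not details from our production search.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Listing:
    name: str
    availability: float  # hypothetical score in [0, 1]; higher means more likely in stock


# Assumed thresholds, chosen only for the example.
HIDE_BELOW = 0.1
BADGE_BELOW = 0.5


def render_results(
    listings: Iterable[Listing],
    hide_below: float = HIDE_BELOW,
    badge_below: float = BADGE_BELOW,
) -> List[str]:
    """Drop very unlikely items and badge the ones that are at risk."""
    rows: List[str] = []
    for item in listings:
        if item.availability < hide_below:
            continue  # filtered out of search results entirely
        label = item.name
        if item.availability < badge_below:
            label += " [likely out of stock]"
        rows.append(label)
    return rows


results = render_results([
    Listing("baking yeast", 0.05),
    Listing("flour", 0.30),
    Listing("sugar", 0.90),
])
```

In this example the yeast is hidden, the flour is shown with a badge, and the sugar is shown as usual.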
How has last-mile delivery changed?
Algorithmically, our last mile looks very different now than it did eight weeks ago. We’ve spent years building a carefully choreographed string of Machine Learning models to predict demand, understand capacity, and make order dispatching decisions just in time. Those have all been thrown out of whack by long social distancing lines outside stores and by frequently changing store hours as our retail partners respond to increased demand.
We’ve rapidly adjusted our fulfillment capacity model that calculates just how much order capacity we have at any given moment throughout the day as shoppers log on and off and orders are completed.
This model relies on a host of data like the number of shoppers on the platform in a given area, the shopping speeds at any given retail location, and the number of orders in our queue at different locations to understand our true order capacity on the ground.
Since the onset of this crisis, we’ve doubled the rate at which we run this model, taking into account new store hours and restocking times. We re-compute the model every two minutes to get a near real-time understanding of our true fulfillment capacity on the ground. That’s why sometimes customers can check their app throughout the day and see new fulfillment times open up.
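As a back-of-the-envelope sketch of the inputs named above (shoppers on the platform, shopping speed, and queued orders), spare capacity at a store can be approximated as expected throughput minus the queue. This is a simplification with assumed names (`StoreSnapshot`, `spare_capacity`) and invented numbers, not the fulfillment capacity model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreSnapshot:
    """Hypothetical point-in-time view of one retail location."""
    active_shoppers: int
    orders_per_shopper_hour: float  # assumed measure of shopping speed
    queued_orders: int


def spare_capacity(snapshot: StoreSnapshot, horizon_hours: float) -> int:
    """Orders that could still be accepted within the horizon (never negative)."""
    throughput = (
        snapshot.active_shoppers
        * snapshot.orders_per_shopper_hour
        * horizon_hours
    )
    return max(0, int(throughput) - snapshot.queued_orders)


# Recomputing on a fresh snapshot picks up shoppers logging on or off.
before = StoreSnapshot(active_shoppers=10, orders_per_shopper_hour=1.5, queued_orders=22)
after = StoreSnapshot(active_shoppers=12, orders_per_shopper_hour=1.5, queued_orders=22)

capacity_before = spare_capacity(before, horizon_hours=2.0)
capacity_after = spare_capacity(after, horizon_hours=2.0)
```

Here two more shoppers logging on raises the spare capacity from 8 to 14 orders over a two-hour horizon, which is the kind of change that makes new fulfillment times appear in the app between recomputations.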
Want to solve hard technical problems like these? Join our Hacker Org! Visit our careers site to see our current openings.