
Optimizing data warehouse storage
By Anupom Syam. Background: At Netflix, our current data warehouse contains hundreds of petabytes of data stored in AWS S3, and each day we ingest and Read More
There are many ways to configure the cache in a microservices system. As a rule of thumb, you should use caching only in one place; Read More
In case you missed it, the first Data+AI Summit (formerly Spark+AI Summit) was held last week, and we had a chance to participate. The talks Read More
We build AI software in two modes: experimentation and productization. During experimentation, we are trying to see if modern technology will solve our problem. If Read More
Today, we announced the new SQL Analytics service to provide Databricks customers with a first-class experience for performing BI and SQL workloads directly on the Read More
We are excited to announce that we have just released BigFlow 1.0 as open source. It’s a Python framework for big data processing on the Read More
In this first of two blogs, we want to talk about WHY an organization might want to look at a lakehouse architecture (based on Delta Read More
Today’s most valuable data is locked away in silos. Its producers have most likely never been remunerated for their contributions, nor will they ever be Read More
Headlines abound on the opportunities and dangers of machine learning. Now armed with plenty of examples, the Data Science community is at least aware that Read More
For All Aspiring Big Data Engineer Source: Codementor