This 11-video course discusses the transition of data warehousing to cloud-based solutions using the AWS (Amazon Web Services) cloud platform. You will examine the implications of storing different types of data from different sources within an organization. You should be familiar with provisioning and working with cloud resources, basic big data architecture, distributed systems, and using shell commands at a Linux terminal prompt. You will learn that an organization may have data silos, which can prevent other teams within the organization from accessing data. You will learn how to use data lakes, centralized repositories that store data at scale, as a viable solution to the data silos that might exist within an organization. You will learn the difference between a data lake, which stores all kinds of raw data in its native format before the data has been processed, and a data warehouse, which contains data that can be used directly to generate business insights. Finally, this course demonstrates storing data with the AWS Redshift data warehouse.
Data Silos, Lakes, & Streams: Introduction