
Unlocking Data, Accelerating Insights

The UCLA Data Lakehouse is a campus-wide data repository designed to empower the UCLA community with seamless access to integrated, high-quality data for analytics and decision-making.

 

The UCLA Data Lakehouse is a modern data architecture that blends the benefits of a data lake and data warehouse into a single, unified platform. Like a data lake, it can store large volumes of raw and varied data from multiple sources, including structured, semi-structured and unstructured data sets. Like a data warehouse, it supports fast, reliable queries and analytics by organizing and optimizing structured data for performance and governance.

This hybrid approach allows UCLA to ingest and manage data at scale while making it readily accessible for reporting, dashboards, visualizations, machine learning and advanced analytics. The result is a more agile, efficient and modern data environment that supports innovation, equity and data-informed decision-making across the institution.

UCLA Data Lakehouse Diagram

UCLA Data Lakehouse - Modern Data Platform

The Data Lakehouse was built as a one-stop enterprise repository of university data that provides unique capabilities to combine information across domains. It is a foundational framework that applications, analytics platforms, and intelligent tools can use to access information in real time. Its purpose is to serve as a shared analytics space that allows UCLA to shift from decentralized data silos to managed and timely data sharing. It is an important component of our modern data platform, as shown in the EDA Data Platform Model diagram.

EDA Data Platform Model

Unified & Integrated

A unified repository for UCLA that streamlines the ingestion of data, enhances integration across diverse data sets for broader insights and ensures secure data management standards across domains.

Interoperable

A shared analytics environment allows UCLA to transition from decentralized silos to a structured, secure, and efficient framework that enables timely and managed data sharing across BI tools and reporting platforms.

Governance

A centralized hub for data assets that establishes access policies, standardizes definitions, enables data stewardship and guides appropriate data usage.

Data currently in the Data Lakehouse

The following data has been successfully curated and ingested into the Data Lakehouse:

UCPath

BruinCard

AQMD (Air Quality Survey)

UID

Personnel/Payroll Historical Data

Student Data

Wi-Fi

Salesforce

Canvas (Bruin Learn)

Parking

Pay Station Parking

Kronos

How do I take advantage of the Data Lakehouse for my department's data analysis needs?

Please contact us at ucladatalake@it.ucla.edu.