Google BigQuery is a cost-effective, highly scalable, serverless data warehouse designed for business agility. The platform manages all resources, scales automatically and is highly available. Consequently, warehouse administrators don’t have to spend time making typical implementation decisions like CPU sizing or maximum storage allocation.
Google BigQuery supports two different SQL dialects:
Click here for a complete explanation of the Google BigQuery Architecture.
Why Automate Google BigQuery With Qlik?
In fact, we can automate these areas and many more with the Qlik Data Integration (QDI) platform. The latest release of the QDI solution is optimized for Google BigQuery and automates many aspects of BigQuery operation. For example, we can use Qlik Replicate for near real-time change data capture (CDC) and Qlik Compose for Data Warehouses to build internal data warehouse structures and provision data mart datasets.
Qlik Data Integration For Google BigQuery
The following paragraphs describe features specifically created and optimized for data warehouse automation. They are as follows:
Data Warehouse Model Generation
Qlik Compose for Data Warehouses makes it easy to generate a logical model from source data by connecting to source and introspecting the metadata to produce a model that conforms to the third normal form(3NF)/Data Vault methodology. Qlik also provides support for assigning Type1/Type2 history attributes within your logical models, improving the data loading process for the BigQuery dataset tables.
Automated Mapping Generation
Qlik Compose for Data Warehouses will create the BigQuery dataset tables and generate mappings for the full and CDC ETL sets. In the ETL sets, Compose will automatically generate table mappings between the landing area columns and staging columns for the BigQuery Data Warehouse dataset tables. Ultimately, this reduces development lifecycle time.
Data Warehouse ETL Generation
Qlik automatically generates and orchestrates the SQL syntax required to execute the ETL process within BigQuery that complete the full load of the tables in the data warehouse. Qlik also automatically generates and orchestrates the SQL syntax necessary to load the CDC data into BigQuery datasets.
Data Mart ETL Generation
Qlik provides a data wizard that automatically creates a transactional data mart utilizing the BigQuery data warehouse data set. Qlik’s automation refreshes data in a transactional BigQuery data mart without complex manual transformation or scripting.
Workflow Generation and Orchestration
You can also use Qlik to generate a full and CDC workflow with tasks to load the BigQuery data warehouse and data mart tables. The workflows to load CDC data can be scheduled through Qlik or from a third-party scheduling tool.
Six Steps To Google BigQuery Automation
It just takes six steps to automate Google Big Query with Qlik, and they are as follows:
Step 1. Automate Data Ingest and Update
Configure Qlik Replicate Task to use Google BigQuery as an endpoint. Qlik Replicate will create the dataset in BigQuery from virtually any data source whether on-premises or in the cloud. Once Qlik Replicate completes the full data load, the tasks transitions to CDC mode to replicate source data changes in near real-time.
Step 2. Qlik Compose Source and Data Warehouse Configuration
Configure a Qlik Compose for Data Warehouses project landing and data warehouse connection. The data warehouse connection will provide details of the target data warehouse and data mart dataset. The landing connections specifies the BigQuery data set that will be used for integration within the data model.
Step 3. Automatically Generate the Warehouse Data Model
The model panel is used to intelligently discovered table metadata for the logical and physical data warehouse model. Manage Model controls which attribute changes and history types can be applied to the data warehouse model.
Step 4. Generate Data Warehouse Automation Instructions
The data warehouse panel selects tables that will be created as Google BigQuery tables in the data warehouse dataset. Table mappings will be created that manage the relationships between the landing and staging columns in the BigQuery tables. Once completed, the ETL process code is generated and ready for execution on BigQuery.
Step 5. Create Transactional Data Marts
The data mart panel provides a “New Star Schema” wizard to create the transactional star schema. The wizard starts by asking you to choose a fact table from the data warehouse dataset. You then select the parent dimension tables for the chosen fact table. You also select an attribute to serve as the transaction date for the schema. Once the schema wizard completes, BigQuery syntax is generated and executed to load create and load the data mart tables.
Step 6. Orchestrate the Warehouse Workflow
When the data warehouse CDC ETL are set, a task is created and added to the default CDC workflow in the Qlik Compose monitor screen. A data mart task is also added to the default CDC workflow.
Conclusion
Google BigQuery is a market-leading cloud data warehouse that provides exceptional performance when manipulating large data sets for analytic workloads. Although BigQuery offers virtually hands-free management, Qlik’s Data Warehouse automation solution will save you significant development time when used with the Google platform. Complete the following form to request a personalized demonstration of Qlik’s low-code data warehouse automation solution.