My Learning Space
Space to take notes, learn and share.
Hadoop – Compute
Hadoop schedules MapReduce jobs across the nodes of a cluster using the JobTracker. The JobTracker takes a MapReduce job, breaks it into small map and reduce tasks, and then schedules them across the various machines in your cluster. It also ensures...
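To make the flow concrete, here is a minimal word-count driver sketched in Scala. It assumes the Hadoop MapReduce client libraries are on the classpath and reuses Hadoop's built-in TokenCounterMapper and IntSumReducer classes; the class name WordCountDriver and the argument handling are just for illustration.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{IntWritable, Text}
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer

object WordCountDriver {
  def main(args: Array[String]): Unit = {
    val job = Job.getInstance(new Configuration(), "word count")
    job.setJarByClass(getClass)

    // Built-in classes: the mapper emits (word, 1) pairs,
    // the reducer (also used as a combiner) sums the counts per word.
    job.setMapperClass(classOf[TokenCounterMapper])
    job.setCombinerClass(classOf[IntSumReducer[Text]])
    job.setReducerClass(classOf[IntSumReducer[Text]])
    job.setOutputKeyClass(classOf[Text])
    job.setOutputValueClass(classOf[IntWritable])

    FileInputFormat.addInputPath(job, new Path(args(0)))   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args(1))) // output directory

    // waitForCompletion submits the job to the cluster scheduler
    // (the JobTracker in MRv1), which breaks it into map and reduce
    // tasks and assigns them to TaskTrackers on the worker nodes.
    System.exit(if (job.waitForCompletion(true)) 0 else 1)
  }
}
```

Submitting the job via waitForCompletion is the hand-off point: from there on, task placement is the scheduler's job, not the client's.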
Hive Table operations and Storage config
Hive Data Types
Hadoop Data Types Overview

| Category | Data Type | Description |
| --- | --- | --- |
| Primitive | TINYINT | 1-byte signed integer |
| Primitive | SMALLINT | 2-byte signed integer |
| Primitive | INT | 4-byte signed integer |
| Primitive | BIGINT | 8-byte signed integer |
| Primitive | FLOAT | Single precision floating point |
| Primitive | DOUBLE | Double... |
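As a quick illustration, here is a minimal sketch that declares a Hive table using these primitive types. It assumes a Spark build with Hive support and a configured metastore; the table name type_demo and the column names are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession

object HiveTypesDemo {
  def main(args: Array[String]): Unit = {
    // Assumes Hive support is available and a metastore is configured.
    val spark = SparkSession.builder()
      .appName("HiveTypesDemo")
      .enableHiveSupport()
      .getOrCreate()

    // One column per primitive type from the table above.
    spark.sql(
      """CREATE TABLE IF NOT EXISTS type_demo (
        |  tiny_col   TINYINT,
        |  small_col  SMALLINT,
        |  int_col    INT,
        |  big_col    BIGINT,
        |  float_col  FLOAT,
        |  double_col DOUBLE
        |)""".stripMargin)

    // DESCRIBE echoes the declared column types back.
    spark.sql("DESCRIBE type_demo").show()
    spark.stop()
  }
}
```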
MapReduce Task Scheduling – Simplified Diagram
Here's a simplified explanation of the MapReduce task scheduling architecture:

graph TD
  A["Client Job Submission"] --> B["JobTracker"]
  B --> C["Resource Manager"]
  C --> D1["TaskTracker 1"]
  C --> D2["TaskTracker 2"]
  C --> D3["TaskTracker n"]
  D1 --> E1["Map Tasks"]...
Apache Spark
Graph processing with Pregel
What is Pregel? Pregel is a distributed graph processing system developed by Google that follows a vertex-centric approach for large-scale graph computations. How does it work? It works through iterative supersteps where each vertex: Receives messages from the...
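The vertex-centric loop maps naturally onto Spark GraphX's Pregel API. Below is a minimal single-source shortest-path sketch in Scala, assuming GraphX is available; the tiny three-vertex graph and the choice of source vertex are invented purely for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx._

object PregelShortestPaths {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setAppName("PregelShortestPaths").setMaster("local[*]"))

    // A tiny weighted graph, purely for illustration.
    val edges = sc.parallelize(Seq(
      Edge(1L, 2L, 1.0), Edge(2L, 3L, 2.0), Edge(1L, 3L, 4.0)))
    val graph = Graph.fromEdges(edges, Double.PositiveInfinity)

    // Start with distance 0 at the source and infinity everywhere else.
    val sourceId: VertexId = 1L
    val initialGraph = graph.mapVertices((id, _) =>
      if (id == sourceId) 0.0 else Double.PositiveInfinity)

    // Each superstep: a vertex merges incoming candidate distances,
    // keeps the minimum, and sends improved distances along its out-edges.
    val shortestPaths = initialGraph.pregel(Double.PositiveInfinity)(
      (id, dist, newDist) => math.min(dist, newDist),   // vertex program
      triplet =>                                        // send messages
        if (triplet.srcAttr + triplet.attr < triplet.dstAttr)
          Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
        else
          Iterator.empty,
      (a, b) => math.min(a, b)                          // merge messages
    )

    // Vertices go inactive once they stop receiving messages;
    // the computation halts when no messages remain.
    shortestPaths.vertices.collect.foreach(println)
    sc.stop()
  }
}
```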