My Learning Space
Space to take notes, learn and share.
Azure Machine Learning Pipelines
Azure Machine Learning pipelines are workflows of executable steps that let users complete machine learning tasks end to end. Executable steps in Azure Machine Learning pipelines include data import, transformation, feature engineering, model training, model optimisation, deployment...
Introduction to Azure Databricks
Databricks is a cloud-based data engineering tool used for data transformation and data exploration through machine learning models. Azure Databricks is the Microsoft Azure platform’s implementation of Databricks. Evolution of Databricks: A short timeline of evolution...
Data Modeling in Power BI
Power BI provides a handful of features for building robust data models. Here are a few concepts to begin modeling data in Power BI: Fact tables & dimension tables: In its simplest form, a data model design will consist of the following: Fact table: Also...
Dictionary in Python
Python Dictionary A dictionary in Python is a data structure that stores data in key: value format. There are several similarities between Python lists and dictionaries; however, they differ in how their elements are accessed. List elements are ordered and are accessed...
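As a minimal sketch of the access difference described in the excerpt above, the snippet below reads one element from a list by position and one from a dictionary by key. The variable names and sample values are made up for illustration and do not come from the original post.

```python
# Illustrative data only: names and prices are hypothetical.
fruits_list = ["apple", "banana", "cherry"]
fruit_prices = {"apple": 1.20, "banana": 0.50, "cherry": 3.00}

# A list element is accessed by its integer position.
first_fruit = fruits_list[0]

# A dictionary value is accessed by its key.
banana_price = fruit_prices["banana"]

# dict.get returns None (or a supplied default) when the key is absent,
# instead of raising KeyError as fruit_prices["mango"] would.
missing_price = fruit_prices.get("mango")
```

Using `get` is one common way to handle keys that may not exist; indexing with square brackets is stricter and fails loudly on a missing key.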
Loading Data from Kaggle to Amazon S3 Bucket
Loading data from Kaggle directly into S3 is a two-step process. In the first step, we configure Kaggle to be able to download. In the second step, we extract data from Kaggle into an S3 bucket. Get data from Kaggle To get data from Kaggle, we set up the Kaggle command line tool...
Correlation Coefficient – Simplistic explanation
What is a Correlation Coefficient Correlation means a mutual relationship or connection between two or more things (variables). The correlation coefficient is a numeric measure that quantifies this relationship. The coefficient describes two aspects of a relationship....
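To make the idea of a numeric measure concrete, here is a small sketch that computes a correlation coefficient for two lists of numbers. It assumes the post refers to the Pearson correlation coefficient, which the excerpt does not state; the function name `pearson_r` and the sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalised variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")

    return cov / math.sqrt(var_x * var_y)


# Sample data (made up): ys rises in step with xs, ys_reversed falls.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
ys_reversed = [8, 6, 4, 2]

r_up = pearson_r(xs, ys)
r_down = pearson_r(xs, ys_reversed)
```

The result always lies between -1 and 1. In this sample, `r_up` comes out at 1 because the two variables move together perfectly, and `r_down` comes out at -1 because they move in exactly opposite directions.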