# Econometrics techniques for data science

## Methods, models, tools and business solutions

I wrote an article a while ago about econometrics (Econometrics 101 for Data Scientists). The article resonated well with readers, but it was an introductory piece for data science people who might not otherwise be familiar with the domain.

# What is econometrics

Econometrics is a sub-domain of economics that applies mathematical and statistical models with economic theories to understand, explain…

# Avoid Overfitting with Regularization

## L1 and L2 regularization in LASSO, Ridge & ElasticNet regression

You are not alone if you had a hard time understanding what exactly Regularization is and how it works. Regularization can be a very confusing term and I’m attempting to clear up some of that in this article.

# What is the problem?

Data scientists take great care during the modeling process to make sure their models work well and they are neither under- nor overfit.

# Probability distribution: an intuition for data scientists

## Intuition and use cases of Gaussian, Binomial and Poisson distribution

As a data scientist, if you are asked to find the average income of customers, how’d you do that? Having ALL customer data is of course “good to have”, but in reality it never exists, nor is it feasible to collect.
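One common way around the missing-data problem is to estimate the average from a random sample. The snippet below is only a sketch of that idea: the "population" is simulated, and the Gaussian shape, the mean of 50,000 and the spread of 12,000 are assumptions invented for illustration, not real customer figures.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Simulated "population" of customer incomes. The Gaussian shape and the
# numbers 50_000 and 12_000 are assumptions made up for this illustration.
population = [random.gauss(50_000, 12_000) for _ in range(100_000)]

# In practice we only observe a sample, here 1,000 randomly chosen customers.
sample = random.sample(population, 1_000)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:,.0f}")
print(f"sample mean:     {sample_mean:,.0f}")
```

The two printed values should land close to each other, which is the intuition behind using a sample in place of the full customer base.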

# Logistic Regression: From Statistical Concept to Machine Learning

## From simple intuition to complex model building process

Logistic regression is among the most popular algorithms used to solve classification problems in machine learning. It uses a logistic function to model how different input variables impact the probability of binary outcomes. The available literature often describes the technique in a convoluted way. The purpose of this article is to describe the model in simple terms, primarily focusing on building an intuition and avoiding complex mathematical formulation as much as possible.
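To make the role of the logistic function concrete, here is a minimal sketch in plain Python. The helper names (`logistic`, `predict_probability`) and the coefficient values are hypothetical, chosen only to show how a weighted sum of inputs is squeezed into a probability between 0 and 1. In a real model the coefficients would be estimated from data.

```python
import math


def logistic(z):
    """Map any real number z to a value strictly between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, bias):
    """Probability of the positive outcome for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


# Hypothetical coefficients, made up for illustration (not fitted to data).
weights = [0.8, -0.5]
bias = 0.1

p = predict_probability([2.0, 1.0], weights, bias)
label = 1 if p >= 0.5 else 0  # 0.5 is a conventional, adjustable cut-off
print(round(p, 3), label)
```

A weighted sum of zero maps to a probability of exactly 0.5, and larger positive sums push the probability toward 1.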

# Working with Python Lists: a Cheatsheet

## Methods, functions and use cases of Python lists

After writing a few pieces on topics like econometrics, logistic regression and regularization — I’m back to the basics!

# Breaking into data science

If you want to get into the field of data science and machine learning, here’s your pathway:

## Step 1

Learn Data Science 101. Focus on what business problems it solves, what the different sub-disciplines are, etc. Follow people in data science and see what they say. Maybe listen to some data science podcasts.

## Step 2

Pick a language you are comfortable with. I recommend Python, for various reasons I won’t go into today.

## Step 3

Get a superficial understanding of common data science and machine learning algorithms, how they work and what business problems they solve:

• Descriptive statistics
• Linear regression
• Logistic regression
• Nearest Neighbors
• Decision-trees
• Clustering
• Time…

# General intuition

If we have more money in our pockets, we tend to spend more; that’s almost a fact that everyone knows. But what’s often not known is the exact relationship between income and expenditure, i.e. how much people would spend at a given income.
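One simple way to put a number on that relationship is to fit a straight line through income and spending observations. The sketch below does this with ordinary least squares in plain Python. The five data points are invented for illustration only, and a straight line is an assumption about the shape of the relationship, not a finding.

```python
# Made-up observations (in thousands), used only to illustrate the method.
income = [20, 30, 40, 50, 60]
spending = [18, 25, 31, 38, 44]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(spending) / n

# Ordinary least squares for one predictor:
# slope = covariance(x, y) / variance(x)
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, spending))
s_xx = sum((x - mean_x) ** 2 for x in income)

slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"spending = {intercept:.2f} + {slope:.2f} * income")
```

With these illustrative numbers the slope works out to 0.65, which would read as roughly 0.65 extra units of spending for each extra unit of income.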

# Python loops: Some beginner-friendly looping challenges

## Looping through lists, tuples, dictionaries and strings

In programming, a loop is a logical structure that repeats a sequence of instructions until certain conditions are met. Looping allows for repeating the same set of tasks on every item in an iterable object, until all items are exhausted or a looping condition is reached.

# What is looping

In programming, looping means repeating the same set of computations in the same sequence a number of times.
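The short sketch below shows that idea for the four iterables named above (a list, a tuple, a dictionary and a string), plus a `while` loop that stops when a condition is reached. All variable names and values are made up for illustration.

```python
# List: add up every item.
prices = [4.5, 3.0, 7.25]
total = 0.0
for price in prices:
    total += price

# Tuple: square each coordinate.
point = (3, 4)
squares = []
for coord in point:
    squares.append(coord ** 2)

# Dictionary: loop over key/value pairs and keep items that are in stock.
stock = {"apples": 3, "pears": 0, "plums": 5}
in_stock = []
for name, count in stock.items():
    if count > 0:
        in_stock.append(name)

# String: count the vowels, one character at a time.
word = "looping"
vowels = 0
for ch in word:
    if ch in "aeiou":
        vowels += 1

# while loop: keep halving until the condition n > 1 no longer holds.
n = 10
steps = 0
while n > 1:
    n //= 2
    steps += 1

print(total, squares, in_stock, vowels, steps)
```

The `for` loops end when the iterable is exhausted, while the `while` loop ends when its condition becomes false.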