Machine Learning Foundation for Product Managers

Shivam Agarwal
2 min read · Jan 18, 2022

What is machine learning: In simple words, machine learning is when a machine learns from data on its own, without being given explicit instructions or rules.

What is the difference between traditional software development and machine learning: In the traditional approach, we provide the logic or rules that transform input into output. In machine learning, we provide data, and the machine learns from that data to create its own rules for generating output from input.

Credit Jon Reifschneider
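The contrast between hand-written rules and learned rules can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the "hot day" task, the 30-degree threshold, the sample data, and the function names are all hypothetical and not taken from any real product.

```python
# Traditional software: a person writes the rule.
def is_hot_rule(temperature_c):
    return temperature_c >= 30  # threshold chosen by hand (hypothetical)


# Machine learning: the rule (here, a single threshold) is learned from data.
def learn_threshold(temperatures, labels):
    """Pick the threshold that misclassifies the fewest labeled examples."""
    best_threshold, best_errors = None, len(labels) + 1
    for candidate in sorted(temperatures):
        errors = sum(
            (t >= candidate) != label
            for t, label in zip(temperatures, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    return best_threshold


# Made-up data: temperature readings and whether people called the day "hot".
temperatures = [12, 18, 22, 25, 27, 29, 31, 35]
labels = [False, False, False, False, True, True, True, True]

learned = learn_threshold(temperatures, labels)


def is_hot_learned(temperature_c):
    return temperature_c >= learned


print("hand-written rule says 28 is hot:", is_hot_rule(28))
print("learned rule says 28 is hot:", is_hot_learned(28))
```

In the first function a person decided the threshold. In the second, the threshold comes from the labeled examples, so changing the data changes the rule without anyone rewriting the code.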

What is the difference between AI and machine learning: AI is the broader concept of creating machines that mimic human intelligence. Machine learning uses techniques conceptualized in AI to achieve the goal of generating output from input. So machine learning is a subset of artificial intelligence.

Some basic data terminology:

a. Structured Data: Data with a predefined format that every record follows. Ex. spreadsheets, SQL tables

b. Unstructured Data: Data whose records do not follow any predefined format. Ex. audio clips, videos

Types of Variables:

a. Continuous: These variables can take an infinite number of values between any two values. Ex. Temperature

b. Categorical: These variables can take only a fixed set of categories or values. Ex. Gender

c. Discrete: Numerical variables that can take only a countable number of values between any two values. Ex. Year
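To make the three variable types concrete, here is a small sketch that labels columns of a made-up dataset. The column names, the values, and the `variable_type` helper are hypothetical, and the helper is a rough heuristic for illustration only. In real data the type depends on what the column means, not just on how it is stored.

```python
def variable_type(values):
    """Rough heuristic: whole numbers -> discrete, other numbers -> continuous,
    anything else -> categorical."""
    def is_int(v):
        return isinstance(v, int) and not isinstance(v, bool)

    def is_number(v):
        return isinstance(v, (int, float)) and not isinstance(v, bool)

    if all(is_int(v) for v in values):
        return "discrete"
    if all(is_number(v) for v in values):
        return "continuous"
    return "categorical"


# Made-up columns matching the examples in the list above.
columns = {
    "temperature_c": [21.5, 19.0, 30.2],
    "gender": ["female", "male", "female"],
    "year": [2020, 2021, 2022],
}

types = {name: variable_type(values) for name, values in columns.items()}
for name, kind in types.items():
    print(name, "->", kind)
```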

What is a model: Basically, a model describes the relationship between variables, typically between the input variables and the variable we want to predict.

Basic terminology for models

a. Features: Independent variables that the algorithm uses to predict the target variable

b. Algorithm: A statistical template that can be used to create a model

c. Hyperparameters: Settings chosen before training, rather than learned from the data, that can be tuned to improve the performance of the model

d. Loss function: Measures how far the model's predictions are from the actual values, and works as feedback to optimize the model

e. Train dataset: The part of the overall dataset that is used to train the model

f. Test dataset: The part of the overall dataset that is used to evaluate the performance of the model

g. Validation dataset: A part of the train dataset that is held out to tune and optimize the model's performance

h. Target variable: The variable that we are trying to predict using the model.
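The terms above can be tied together in one small sketch. It assumes a made-up dataset (one feature, one target, roughly target = 2 x feature + 1) and uses a simple linear model trained by gradient descent as the algorithm. The names `fit` and `mse`, the candidate learning rates, and the 30/10/10 split are illustrative choices, not a recommended setup.

```python
import random

random.seed(0)

# Made-up data: one feature and a target that is roughly 2 * feature + 1.
features = [x / 10 for x in range(50)]
targets = [2 * x + 1 + random.uniform(-0.1, 0.1) for x in features]

# Shuffle the records, then split them into train, validation, and test sets.
rows = list(zip(features, targets))
random.shuffle(rows)
train, validation, test = rows[:30], rows[30:40], rows[40:]


def mse(rows, w, b):
    """Loss function: mean squared error of the predictions w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)


def fit(rows, learning_rate, epochs):
    """Algorithm: linear model trained by gradient descent on the MSE loss."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hyperparameter tuning: try a few learning rates and keep the one with the
# lowest loss on the validation set.
candidates = [0.001, 0.01, 0.05]
best_rate = min(
    candidates,
    key=lambda rate: mse(validation, *fit(train, rate, epochs=1000)),
)

# Train the final model and evaluate it once on the untouched test set.
w, b = fit(train, best_rate, epochs=1000)
test_loss = mse(test, w, b)

print("chosen learning rate:", best_rate)
print("learned w and b:", round(w, 2), round(b, 2))
print("test loss:", round(test_loss, 4))
```

The test set is touched only once, at the very end, so the reported loss reflects how the model does on data it never saw during training or tuning.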

Shivam Agarwal

Shivam is an accomplished analytics professional and algo trader, sharing expertise in algo trading, data science, and AI through insightful publications.