Posts

Showing posts from 2021

"""What do Normalization & Standardization of Data basically mean?"""

Normalization and standardization both rescale the original data without changing its behavior or nature. Both are part of data preprocessing. What is Normalization? Normalization is one of the techniques used in data preprocessing. We define a new range (usually 0 to 1) and convert the data accordingly. This technique is useful for algorithms such as neural networks and distance-based methods (e.g. KNN, K-means). It is also known as Min-Max scaling. Why is normalization important? Let's understand it with an example. Suppose we are building a predictive model using a dataset that contains the net worth of the citizens of a country. In this dataset we find a large variation in the values. If we feed this data to a model as it is, the model may produce undesirable results. To avoid that, we opt for normalization. What is Standardization? Data standardization is the process of rescaling one or more attributes so t...
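To make the Min-Max idea concrete, here is a minimal Python sketch. The function name `min_max_scale` and the net worth figures are made up for illustration; the only thing taken from the post is the idea of mapping the data onto a new range such as 0 to 1.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Rescale values linearly so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [new_min for _ in values]
    span = hi - lo
    return [new_min + (v - lo) * (new_max - new_min) / span for v in values]


# Hypothetical net worth values with a very large variation.
net_worth = [12_000, 45_000, 250_000, 1_000_000, 75_000_000]
scaled = min_max_scale(net_worth)

for original, value in zip(net_worth, scaled):
    print(f"{original:>12,} -> {value:.6f}")
```

The order of the values is preserved; only the scale changes, so every value ends up between 0 and 1.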

"""Clustering & Clustering Algorithms"""

What is Clustering? The method of identifying similar groups of data in a dataset is called clustering. It is one of the most popular techniques in data analysis, and it is an unsupervised learning technique. A simple case where clustering can be useful: imagine you own a shop and want to understand the preferences of your customers to improve your profit margin. It is not possible to look at the details, buying habits, and patterns of each customer and plan a separate business strategy for each of them. Instead, you can cluster all of your customers into, say, 5 groups depending on their purchasing habits and use a separate strategy for the customers in each of these 5 groups. This is what is called clustering. There are many clustering algorithms, but we will look at the popular ones among them. K-means Clustering: The following diagram shows the K-means clustering operation on mixed data points. K-means clustering follows partitioning & observations in k clusters appr...
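As a rough illustration of the customer-grouping idea, here is a small K-means sketch in plain Python. It uses the standard assign-then-update loop, which may differ in detail from what the full post describes. The customer data, the helper names, and the choice of k = 2 (to keep the toy example short) are all assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centroids):
    """Put every point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans(points, k, iterations=100, seed=0):
    """Alternate between assigning points and moving centroids until nothing changes."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    for _ in range(iterations):
        clusters = assign(points, centroids)
        new_centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: the centroids stopped moving
        centroids = new_centroids
    return centroids, assign(points, centroids)


# Hypothetical customers: (visits per month, average spend per visit).
customers = [(1, 10), (2, 12), (1, 11), (10, 100), (11, 105), (12, 98)]
centroids, clusters = kmeans(customers, k=2)

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

On this toy data the occasional shoppers and the frequent big spenders end up in separate groups, and each group could then get its own strategy.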

"""Don't get confused between Linear Regression & Logistic Regression.""""

Linear Regression: In the above image, the dependent variable (salary) is on the y-axis and the independent variable (experience) is on the x-axis. The regression line can be written as y = a₀ + a₁x + ε, where a₀ and a₁ are the coefficients and ε is the error term. The line of best fit, or regression line, is found when the sum of the squares of the residuals, ∑(Y − h(X))², is minimum. The test error depends on the test data. If the test data is an exact copy of the training data, the test error equals the training error, which is zero only when the line passes through every point. In simple linear regression, there is one independent variable and two coefficients are needed (Y = a + bx + error). Linear regression provides a continuous output (the dependent variable). The output of linear regression must be a continuous value, such as price, age, etc. Logistic regression: Logistic regression is a classification algorithm. Logistic regression is a predictive analysis. Logistic regression is used to describe data and to explain the rela...
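Here is a small sketch of how the line of best fit can be computed for simple linear regression, using the standard closed-form least-squares formulas. The experience and salary numbers are invented for illustration, and the names `fit_line` and `sum_squared_residuals` are just labels for this example.

```python
def fit_line(xs, ys):
    """Return (a0, a1) for the line y = a0 + a1 * x that minimizes the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx               # slope
    a0 = mean_y - a1 * mean_x    # intercept
    return a0, a1


def sum_squared_residuals(xs, ys, a0, a1):
    """Sum of (Y - h(X))^2, where h(X) = a0 + a1 * X is the fitted line."""
    return sum((y - (a0 + a1 * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: years of experience and salary (in thousands).
experience = [1, 2, 3, 4, 5]
salary = [31, 34, 41, 44, 50]

a0, a1 = fit_line(experience, salary)
error = sum_squared_residuals(experience, salary, a0, a1)
print(f"salary = {a0:.2f} + {a1:.2f} * experience, squared error = {error:.2f}")
```

Note that the squared error here is not zero even though it is measured on the same data used for fitting, because the points do not lie exactly on a line.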

"""Machine Learning :Supervised Learning Vs Unsupervised Learning"""

Supervised learning
Note: Supervised learning allows you to collect data or produce a data output from previous experience.
1. Supervised learning is a machine learning method in which models are trained using labeled data: Y = f(X)
2. In supervised learning, the input data (X) is provided to the model along with the output (Y).
3. A supervised learning model takes direct feedback.
4. The goal of supervised learning is to train the model so that it can predict the output when it is given new data.
5. Classification and regression problems come under supervised learning.
Examples of supervised learning algorithms: Linear Regression, Logistic Regression, Support Vector Machine, Multi-class Classification,...
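The idea of training on labeled pairs (X, Y) and then predicting Y for new X can be sketched with a tiny nearest-neighbour classifier. This is only one possible supervised model, chosen here because it fits in a few lines; the data and the function names are hypothetical.

```python
def train(X, Y):
    """'Training' a 1-nearest-neighbour model just means storing the labeled examples."""
    return list(zip(X, Y))


def predict(model, x):
    """Return the label of the stored example closest to x."""
    _, label = min(
        model,
        key=lambda pair: sum((a - b) ** 2 for a, b in zip(pair[0], x)),
    )
    return label


# Hypothetical labeled data: (hours studied, hours slept) -> exam result.
X = [(1, 4), (2, 5), (8, 7), (9, 6)]
Y = ["fail", "fail", "pass", "pass"]

model = train(X, Y)
print(predict(model, (7, 7)))    # a new, unseen input
print(predict(model, (1.5, 4)))
```

The labels in Y are what make this supervised: the model can only predict "pass" or "fail" because it was shown examples of both.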

""What is the difference between Data Analytics & Business Analytics?"""

This is a favorite question among freshers who are seeking opportunities in the analytics sector. There is a significant difference between Data Analytics & Business Analytics. Both business analysts and data analysts work with data. The difference is what they do with the data. Business Analyst vs Data Analyst 1. Business analysts use data to identify business problems and find solutions to them. Data analysts, on the other hand, source data, cleanse it, manipulate it, identify useful insights from it, and draw conclusions from it. Analyzing data is the data analyst's end goal. 2. Business analysts do not perform a deep technical analysis of the data. Business analysts work strategically. They communicate with stakeholders and clients and are concerned only with the business aspects of data ...