**21. What is bucketing in machine learning?**

Converting a (usually continuous) feature into multiple binary features called buckets or bins, typically based on value range. For example, instead of representing temperature as a single continuous floating-point feature, you could chop ranges of temperatures into discrete bins. Given temperature data sensitive to a tenth of a degree, all temperatures between 0.0 and 15.0 degrees could be put into one bin, 15.1 to 30.0 degrees could be the second bin, and 30.1 to 50.0 degrees could be a third bin.
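The temperature example can be sketched in a few lines of Python. This is a minimal illustration, assuming the three ranges given above; the names `BIN_UPPER_EDGES` and `bucketize` are made up for the example, and no range checking is done (values below 0.0 or above 50.0 simply land in the first or last bin).

```python
from bisect import bisect_left

# Upper edges of the first two bins; the third bin takes everything above 30.0.
# These mirror the ranges 0.0-15.0, 15.1-30.0 and 30.1-50.0 from the example.
BIN_UPPER_EDGES = [15.0, 30.0]


def bucketize(temperature):
    """Return a one-hot list with a 1 at the position of the matching bin."""
    index = bisect_left(BIN_UPPER_EDGES, temperature)
    one_hot = [0] * (len(BIN_UPPER_EDGES) + 1)
    one_hot[index] = 1
    return one_hot


print(bucketize(7.5))   # [1, 0, 0]
print(bucketize(15.1))  # [0, 1, 0]
print(bucketize(42.0))  # [0, 0, 1]
```

Each temperature becomes three binary features, exactly one of which is set.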

**22. What are some methods of reducing dimensionality?**

You can reduce dimensionality by combining features through feature engineering, removing collinear (highly correlated) features, or applying an algorithmic dimensionality reduction technique such as principal component analysis (PCA).
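As one illustration, removing collinear features could look like the sketch below. It is a simplified example, not a standard library routine: the helper names, the 0.95 threshold, and the sample columns are all assumptions, and constant columns (zero variance) are not handled.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def drop_collinear(columns, threshold=0.95):
    """Keep a column only if it is not highly correlated with one already kept."""
    kept = {}
    for name, values in columns.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical data: height_in is just height_cm in different units.
columns = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_in": [59.1, 63.0, 66.9, 70.9],
    "age": [35.0, 22.0, 41.0, 29.0],
}
reduced = drop_collinear(columns)
print(list(reduced))  # ['height_cm', 'age']
```

The second height column carries no new information, so dropping it reduces the feature count from three to two.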

**23. How do classification and regression differ?**

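Classification predicts group or class membership, so its output is a discrete label (for example, spam or not spam). Regression predicts a continuous numeric response (for example, a price). Classification is the right technique when the answer must be one of a fixed set of categories; regression is the right one when the answer is a quantity.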
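To make the distinction concrete, here is a small sketch that uses the same k-nearest-neighbours idea for both tasks. The function names and the house-size data are hypothetical; the only difference between the two predictors is how the neighbours' targets are combined.

```python
from collections import Counter


def k_nearest_targets(train, x, k=3):
    """Targets of the k training points whose feature is closest to x."""
    ranked = sorted(train, key=lambda pair: abs(pair[0] - x))
    return [target for _, target in ranked[:k]]


def knn_classify(train, x, k=3):
    """Classification: majority vote over discrete labels."""
    return Counter(k_nearest_targets(train, x, k)).most_common(1)[0][0]


def knn_regress(train, x, k=3):
    """Regression: average of continuous targets."""
    targets = k_nearest_targets(train, x, k)
    return sum(targets) / len(targets)


# Hypothetical data: house size in square metres.
size_to_category = [(30, "small"), (45, "small"), (60, "small"),
                    (90, "large"), (120, "large"), (150, "large")]
size_to_price = [(30, 100.0), (45, 150.0), (60, 200.0),
                 (90, 300.0), (120, 400.0), (150, 500.0)]

print(knn_classify(size_to_category, 100))  # large
print(knn_regress(size_to_price, 100))      # 300.0
```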

**24. What is supervised versus unsupervised learning?**

Supervised learning trains a model on labeled examples: every input comes with the correct output, and the model learns a mapping from inputs to outputs that it can then apply to new data. In contrast, unsupervised learning works with unlabeled data: the model is never told the correct answers and instead looks for structure in the data, such as clusters.
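The contrast can be illustrated with two tiny one-dimensional learners. Both are simplified sketches with made-up names and data: the first learns from values that come with labels, the second only sees the raw values and groups them with a basic two-cluster k-means loop.

```python
def fit_centroids(labeled):
    """Supervised: learn the mean value of each label from labeled examples."""
    groups = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}


def predict(centroids, value):
    """Predict the label whose learned mean is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means)."""
    centers = [min(values), max(values)]
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            clusters[nearest].append(v)
        # Keep the old center if a cluster happens to be empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


labeled = [(1.0, "low"), (1.2, "low"), (0.8, "low"),
           (5.0, "high"), (5.3, "high"), (4.7, "high")]
print(predict(fit_centroids(labeled), 4.0))  # high

unlabeled = [value for value, _ in labeled]
centers, clusters = two_means(unlabeled)
print(clusters)  # [[1.0, 1.2, 0.8], [5.0, 5.3, 4.7]]
```

The unsupervised learner recovers the same two groups, but it has no names for them; attaching meaning to the clusters is left to a person.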

**25. What is a hash table?**

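A hash table is a data structure that implements an associative array, that is, a mapping from keys to values. A hash function converts each key into the index of a slot (bucket) where the value is stored, which gives constant-time lookups and insertions on average. Hash tables are commonly used for database indexing.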
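A minimal sketch of the idea, using separate chaining to handle collisions, is shown below. The class and method names are illustrative, and the table never resizes, which a production implementation would need to do.

```python
class HashTable:
    """A tiny associative array backed by a fixed number of buckets."""

    def __init__(self, capacity=8):
        self._buckets = [[] for _ in range(capacity)]

    def _bucket(self, key):
        # The hash of the key selects which bucket the pair lives in.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default


table = HashTable()
table.put("user_42", "row 1007")
table.put("user_7", "row 13")
table.put("user_42", "row 2001")
print(table.get("user_42"))  # row 2001
print(table.get("missing"))  # None
```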

**26. What is bias in machine learning?**

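An intercept or offset from an origin. Bias (also known as the bias term) is referred to as b or w0 in machine learning models. In a linear model y = wx + b, for example, b is the value the model predicts when the input x is zero.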
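As a quick illustration, assuming a one-feature linear model with hypothetical parameter values, the bias shifts every prediction by the same amount:

```python
def predict(x, weight, bias):
    """One-feature linear model: the bias is the intercept of the line."""
    return weight * x + bias


print(predict(0.0, weight=2.0, bias=0.5))  # 0.5
print(predict(3.0, weight=2.0, bias=0.5))  # 6.5
```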

**27. What is the use of gradient descent?**

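Gradient descent is an optimization technique for minimizing a cost (loss) function. It repeatedly computes the gradient of the cost with respect to the model's parameters and moves the parameters a small step in the opposite direction, so the cost decreases over time. It is widely used because it is easy to implement and works with most ML algorithms whose cost function is differentiable.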
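The loop below is a minimal sketch of gradient descent fitting a line y = wx + b by minimizing mean squared error. The learning rate, step count, and data are assumptions chosen so that this small example converges; they are not general recommendations.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of mean((w*x + b - y)^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(round(w, 3), round(b, 3))  # 2.0 1.0
```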

**28. What is backpropagation in machine learning?**

The primary algorithm for performing gradient descent on neural networks. First, the output values of each node are calculated (and cached) in a forward pass. Then, the partial derivative of the error with respect to each parameter is calculated in a backward pass through the graph.
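The forward and backward passes can be sketched for a deliberately tiny network: one input, one tanh hidden unit, and a linear output with squared error. The architecture and all names here are assumptions made for illustration; real networks apply the same chain rule layer by layer over vectors and matrices.

```python
import math


def forward(x, params):
    """Forward pass: compute the output and cache values the backward pass needs."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)   # hidden activation
    y_hat = w2 * h + b2          # linear output
    return y_hat, (x, h)


def backward(y_hat, y, cache, params):
    """Backward pass: gradients of loss = (y_hat - y)^2 for [w1, b1, w2, b2]."""
    _, _, w2, _ = params
    x, h = cache
    d_y_hat = 2.0 * (y_hat - y)    # d(loss) / d(y_hat)
    d_w2 = d_y_hat * h
    d_b2 = d_y_hat
    d_h = d_y_hat * w2             # propagate the error back to the hidden unit
    d_z = d_h * (1.0 - h * h)      # tanh'(z) = 1 - tanh(z)^2
    d_w1 = d_z * x
    d_b1 = d_z
    return [d_w1, d_b1, d_w2, d_b2]


params = [0.5, -0.1, 0.8, 0.2]
x, y = 1.5, 0.3
y_hat, cache = forward(x, params)
grads = backward(y_hat, y, cache, params)

# One gradient descent step using the backpropagated gradients.
learning_rate = 0.1
new_params = [p - learning_rate * g for p, g in zip(params, grads)]
new_loss = (forward(x, new_params)[0] - y) ** 2
print(new_loss < (y_hat - y) ** 2)
```

Caching `h` in the forward pass is what lets the backward pass reuse it instead of recomputing the activation.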

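The area under the ROC curve (AUC) is the probability that a classifier will be more confident that a randomly chosen positive example is actually positive than that a randomly chosen negative example is positive. In other words, it is the probability that the classifier scores a random positive example higher than a random negative one.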
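That probabilistic definition translates directly into code: compare every positive example's score with every negative example's score and count how often the positive one is higher. The sketch below assumes labels of 1 and 0 and counts ties as half a win (a common convention); the function name and data are illustrative, and the quadratic loop is meant for clarity rather than speed.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs where the positive scores higher."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(auc(labels, scores))  # 0.75
```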

**29. What is a sigmoid function in machine learning?**

A function that maps logistic or multinomial regression output (log odds) to probabilities, returning a value between 0 and 1. It is defined as sigmoid(z) = 1 / (1 + e^(-z)), where z is the log odds.
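A short sketch of the sigmoid function follows. The two-branch form is a common way to avoid overflow in `math.exp` for large negative inputs; the function name and sample inputs are just for illustration.

```python
import math


def sigmoid(log_odds):
    """Map a log-odds value to a probability between 0 and 1."""
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    # For negative inputs, use an equivalent form that never overflows.
    e = math.exp(log_odds)
    return e / (1.0 + e)


print(sigmoid(0.0))              # 0.5
print(sigmoid(4.0) > 0.98)       # True
print(sigmoid(-4.0) < 0.02)      # True
```

A log-odds value of 0 means both outcomes are equally likely, which is why it maps to a probability of 0.5.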

**30. How would you explain machine learning to a five-year-old?**

Machine learning works much like the way a baby learns to walk. A baby cannot walk right away: it tries, falls, gets up, and tries again, doing a little better each time. A machine learning algorithm does the same thing. It makes an attempt, sees how far off it was, and adjusts itself, repeating this over and over until the end result is as good as possible.