Supervised learning requires:
A) Unlabeled data
B) Labeled data
C) No data
D) Only reinforcement signals
Answer: B) Labeled data
Explanation: Supervised learning uses labeled datasets to train models.

Which algorithm is used for classification?
A) K-means
B) KNN
C) Apriori
D) DBSCAN
Answer: B) KNN
Explanation: K-Nearest Neighbors (KNN) is a popular classification algorithm.

Overfitting in ML means:
A) Model performs poorly on training data
B) Model performs well on training data but poorly on test data
C) Model generalizes well
D) Model underfits data
Answer: B) Model performs well on training data but poorly on test data
Explanation: Overfitting occurs when a model memorizes training data and fails to generalize.

Which technique is used to avoid overfitting?
A) Regularization
B) Overtraining
C) Noise addition
D) Ignoring validation data
Answer: A) Regularization
Explanation: Regularization adds a penalty to complex models to avoid overfitting.

Gradient Descent is used to:
A) Maximize loss
B) Minimize loss
C) Stop training
D) Create data
Answer: B) Minimize loss
Explanation: Gradient Descent updates model parameters to minimize the loss function.

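The answer above can be sketched in a few lines: gradient descent repeatedly steps against the gradient until the loss stops shrinking. The quadratic loss, learning rate, and starting point below are illustrative choices, not part of the quiz.

```python
# A minimal sketch of gradient descent minimizing f(x) = (x - 3)^2.
# The function, learning rate, and starting point are illustrative.

def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Repeatedly step opposite the gradient to reduce the loss."""
    x = x0
    for _ in range(steps):
        x = x - lr * grad(x)  # move in the direction that decreases the loss
    return x

# f(x) = (x - 3)^2 has gradient f'(x) = 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 4))  # converges to 3.0
```

With a small enough learning rate, each step shrinks the distance to the minimum by a constant factor, which is why the iterate settles at x = 3.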
Which of the following is a supervised learning algorithm?
A) K-Means Clustering
B) Linear Regression
C) Apriori Algorithm
D) PCA
Answer: B) Linear Regression
Explanation: Linear Regression is a supervised algorithm used to predict continuous values.

Overfitting in machine learning happens when:
A) Model is too simple
B) Model performs poorly on training data
C) Model learns noise and performs poorly on new data
D) Dataset is too large
Answer: C) Model learns noise and performs poorly on new data
Explanation: Overfitting occurs when a model memorizes training data instead of generalizing patterns.

In classification tasks, a confusion matrix is used to measure:
A) Execution time
B) Accuracy and error types
C) Training cost
D) Model complexity
Answer: B) Accuracy and error types
Explanation: A confusion matrix shows TP, FP, TN, and FN to evaluate classification performance.

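The TP/FP/TN/FN counts named in the explanation above are easy to compute directly. The label vectors below are made-up illustrative data.

```python
# Counting TP, FP, TN, FN for a binary classifier (a minimal sketch).

def confusion_counts(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
print(tp, fp, tn, fn)  # → 2 1 2 1
```

Accuracy is just (TP + TN) / total, but the four counts also expose *which kind* of error the classifier makes, which accuracy alone hides.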
Which machine learning algorithm is based on the concept of "margin maximization"?
A) Decision Trees
B) Naïve Bayes
C) Support Vector Machines
D) Random Forest
Answer: C) Support Vector Machines
Explanation: SVM maximizes the margin between classes to achieve better classification boundaries.

Which technique is used for dimensionality reduction?
A) KNN
B) PCA
C) Bagging
D) Boosting
Answer: B) PCA
Explanation: Principal Component Analysis (PCA) reduces feature dimensions while preserving variance.

Which of the following is used to avoid overfitting?
A) Increasing model complexity
B) Using regularization
C) Reducing training data
D) Removing validation set
Answer: B) Using regularization
Explanation: Regularization techniques like L1/L2 add penalties to model complexity to avoid overfitting.

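The L2 penalty mentioned in the explanation above is just a term added to the training loss. The weights, residuals, and lambda value below are illustrative numbers, not taken from any real model.

```python
# Sketch of an L2-regularized loss: data loss plus a penalty on weight size.

def l2_regularized_loss(errors, weights, lam):
    data_loss = sum(e * e for e in errors) / len(errors)  # mean squared error
    penalty = lam * sum(w * w for w in weights)           # L2 penalty term
    return data_loss + penalty

errors = [0.5, -0.5]
small_w = [0.1, 0.2]
large_w = [3.0, 4.0]
# The same data loss is penalized more when the weights are larger,
# which is what pushes training toward simpler models.
assert l2_regularized_loss(errors, large_w, 0.1) > l2_regularized_loss(errors, small_w, 0.1)
```

An L1 penalty works the same way with `sum(abs(w) for w in weights)`, and tends to drive some weights exactly to zero.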
Bagging stands for:
A) Bag Aggregation
B) Boosted Aggregation
C) Bootstrap Aggregation
D) Binary Aggregation
Answer: C) Bootstrap Aggregation
Explanation: Bagging trains multiple models on random samples and combines their predictions to reduce variance.

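The two steps in the explanation above, bootstrap sampling and aggregation, can be sketched without any ML library. To stay self-contained, the "model" here is just a mean predictor over its bootstrap sample; the data and seed are illustrative.

```python
# Sketch of bootstrap aggregation: fit simple "models" on bootstrap samples
# (sampling with replacement) and average their predictions.
import random

def bootstrap_sample(data, rng):
    return [rng.choice(data) for _ in data]  # sample with replacement

def bagged_prediction(data, n_models=50, seed=0):
    rng = random.Random(seed)
    # each "model" predicts the mean of its own bootstrap sample
    predictions = [sum(s) / len(s)
                   for s in (bootstrap_sample(data, rng) for _ in range(n_models))]
    return sum(predictions) / len(predictions)  # aggregate by averaging

data = [1.0, 2.0, 3.0, 4.0, 5.0]
print(bagged_prediction(data))
```

Averaging many models trained on resampled data is exactly why bagging reduces variance: individual models fluctuate, but their mean is more stable.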
Which metric is preferred for imbalanced classification datasets?
A) Accuracy
B) Precision-Recall or F1-Score
C) Execution Time
D) Mean Squared Error
Answer: B) Precision-Recall or F1-Score
Explanation: F1-Score balances precision and recall, making it suitable for imbalanced data.

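The F1 balance described above is the harmonic mean of precision and recall. The counts below describe a hypothetical imbalanced problem (5 positives out of 100 samples) chosen only to illustrate the point.

```python
# Sketch: precision, recall, and F1 from TP/FP/FN counts.

def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)  # of the predicted positives, how many were right
    recall = tp / (tp + fn)     # of the real positives, how many were found
    return 2 * precision * recall / (precision + recall)

# Suppose 5 of 100 samples are positive, and a classifier catches 4 of them
# with 2 false alarms. Accuracy would be 97/100, yet F1 reflects the
# minority-class performance much more honestly.
print(f1_score(tp=4, fp=2, fn=1))  # → 0.727...
```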
Gradient Descent is used for:
A) Sorting Data
B) Minimizing Loss Function
C) Increasing Model Complexity
D) Sampling Data
Answer: B) Minimizing Loss Function
Explanation: Gradient Descent updates parameters in the direction that reduces the cost function.

Which algorithm uses an ensemble of decision trees?
A) Naive Bayes
B) Logistic Regression
C) Random Forest
D) Linear SVM
Answer: C) Random Forest
Explanation: Random Forest combines multiple decision trees to improve accuracy and reduce overfitting.

Which technique reduces variance in ML models?
A) Boosting
B) Bagging
C) Underfitting
D) Regularization Only
Answer: B) Bagging
Explanation: Bagging reduces variance by combining predictions of multiple models trained on bootstrapped data.

Cross-validation helps in:
A) Model deployment
B) Model selection and performance estimation
C) Model compression
D) Model encryption
Answer: B) Model selection and performance estimation
Explanation: Cross-validation provides unbiased model evaluation on unseen data.

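The core of k-fold cross-validation is the index split: every sample lands in the validation set of exactly one fold. A minimal sketch, with the sample count and fold count as illustrative parameters:

```python
# Sketch of k-fold splitting for cross-validation.

def k_fold_indices(n, k):
    """Return k (train, validation) index pairs covering all n samples."""
    folds = []
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        folds.append((train, val))
        start += size
    return folds

folds = k_fold_indices(n=10, k=5)
# 5 folds, each holding out 2 samples for validation
print([val for _, val in folds])
```

Training on each fold's train set and averaging the validation scores gives the performance estimate the quiz answer refers to; scikit-learn's `KFold` implements the same idea.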
Which algorithm is used for dimensionality reduction?
A) PCA
B) SVM
C) Logistic Regression
D) KNN
Answer: A) PCA
Explanation: Principal Component Analysis reduces dimensions while retaining maximum variance.

Which activation function outputs values between 0 and 1?
A) ReLU
B) Tanh
C) Sigmoid
D) Softmax
Answer: C) Sigmoid
Explanation: The sigmoid function squashes input into the range [0, 1], useful for binary classification.

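The squashing behaviour described above is one line of math: sigmoid(x) = 1 / (1 + e^(-x)).

```python
# Sketch of the sigmoid activation, which maps any real input into (0, 1).
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

print(sigmoid(0))    # → 0.5, the midpoint of the range
print(sigmoid(10))   # large inputs approach 1
print(sigmoid(-10))  # large negative inputs approach 0
```

Because the output lives in (0, 1), it can be read as a probability, which is exactly how logistic regression and binary output layers use it.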
Overfitting occurs when:
A) Model is too simple
B) Model performs well on training but poorly on test data
C) Data is insufficient
D) Regularization is applied
Answer: B) Model performs well on training but poorly on test data
Explanation: Overfitting captures noise and fails to generalize to unseen data.

Which evaluation metric is preferred for imbalanced classification problems?
A) Accuracy
B) Precision-Recall or F1-Score
C) Mean Squared Error
D) R² Score
Answer: B) Precision-Recall or F1-Score
Explanation: Precision and recall handle imbalanced class distribution better than accuracy.

Ensemble methods like Random Forest are based on:
A) Combining multiple weak learners
B) Using single strong learner
C) Dimensionality reduction
D) Feature scaling
Answer: A) Combining multiple weak learners
Explanation: Random Forest uses multiple decision trees to improve prediction.

Which activation function outputs between 0 and 1?
A) ReLU
B) Sigmoid
C) Tanh
D) Softmax
Answer: B) Sigmoid
Explanation: Sigmoid squashes values into the range [0, 1].

Regularization helps to:
A) Increase model complexity
B) Prevent overfitting
C) Improve training error
D) Add more parameters
Answer: B) Prevent overfitting
Explanation: Regularization penalizes large weights to help the model generalize.

Which algorithm is used for market basket analysis?
A) Apriori
B) Decision Tree
C) SVM
D) Gradient Descent
Answer: A) Apriori
Explanation: Apriori discovers frequent itemsets and association rules.

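The frequent-itemset step behind Apriori is just support counting: keep the item combinations that appear in enough transactions. A minimal sketch for pairs only, over made-up basket data (full Apriori then grows candidates level by level):

```python
# Support counting for item pairs, the core step of Apriori.
from itertools import combinations

def frequent_pairs(transactions, min_support):
    counts = {}
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] = counts.get(pair, 0) + 1
    # keep only pairs appearing in at least min_support transactions
    return {pair for pair, c in counts.items() if c >= min_support}

transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
]
pairs = frequent_pairs(transactions, min_support=2)
print(pairs)
```

Association rules are then derived from these frequent itemsets, e.g. "customers who buy bread also buy milk", with a confidence computed from the same counts.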
Overfitting occurs when a model:
A) Generalizes well
B) Learns training data too well
C) Performs poorly on training data
D) Has too few parameters
Answer: B) Learns training data too well
Explanation: Overfitting fits noise in training data, reducing test performance.

Which of the following is a supervised learning task?
A) Clustering
B) Classification
C) Dimensionality Reduction
D) Association Rule Mining
Answer: B) Classification
Explanation: Classification uses labeled data to predict categories.

Gradient Descent is primarily used for:
A) Hyperparameter tuning
B) Optimization of loss functions
C) Feature engineering
D) Data preprocessing
Answer: B) Optimization of loss functions
Explanation: Gradient descent minimizes cost functions in ML models.

Dropout is applied to:
A) Reduce underfitting
B) Reduce overfitting
C) Increase training error
D) Improve feature selection
Answer: B) Reduce overfitting
Explanation: Dropout randomly ignores neurons to prevent co-adaptation.

Which kernel is commonly used in SVM?
A) Linear
B) Polynomial
C) RBF (Radial Basis Function)
D) All of the Above
Answer: D) All of the Above
Explanation: SVM supports multiple kernels for nonlinear decision boundaries.

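Of the kernels listed above, the RBF kernel is the standard nonlinear choice: K(x, z) = exp(-gamma * ||x - z||²). The gamma value and the points below are illustrative.

```python
# Sketch of the RBF (Gaussian) kernel used by SVMs.
import math

def rbf_kernel(x, z, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical points → similarity 1.0
print(rbf_kernel([0.0, 0.0], [3.0, 4.0]))  # distant points → close to 0
```

The kernel acts as a similarity score: 1 for identical points, decaying toward 0 with distance, which is what lets the SVM draw curved decision boundaries.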
A confusion matrix is used in:
A) Clustering
B) Classification
C) Regression
D) Dimensionality reduction
Answer: B) Classification
Explanation: A confusion matrix evaluates classification accuracy with TP, TN, FP, and FN.

Ensemble learning combines:
A) Multiple datasets
B) Multiple models
C) Multiple features
D) Multiple cost functions
Answer: B) Multiple models
Explanation: Ensemble learning improves performance by combining predictions of several models.

Which algorithm is unsupervised?
A) Decision Trees
B) K-Means
C) Logistic Regression
D) Naïve Bayes
Answer: B) K-Means
Explanation: K-Means is an unsupervised clustering algorithm.

Dropout is a technique used to:
A) Increase training time
B) Prevent overfitting
C) Reduce features
D) Optimize cost function
Answer: B) Prevent overfitting
Explanation: Dropout randomly disables neurons to improve generalization.

Which ML algorithm is based on Bayes' theorem?
A) KNN
B) Naïve Bayes
C) SVM
D) Decision Tree
Answer: B) Naïve Bayes
Explanation: Naïve Bayes uses conditional probability for classification.

Overfitting occurs when:
A) Model fits training data too well
B) Model generalizes well
C) Training data is insufficient
D) Features are irrelevant
Answer: A) Model fits training data too well
Explanation: Overfitting reduces performance on unseen data.

Which algorithm is used for classification?
A) Linear Regression
B) Logistic Regression
C) K-means
D) PCA
Answer: B) Logistic Regression
Explanation: Logistic regression predicts categorical outcomes.

Ensemble learning improves accuracy by:
A) Using more epochs
B) Combining multiple models
C) Reducing dataset size
D) Removing features
Answer: B) Combining multiple models
Explanation: Ensemble methods like bagging and boosting increase robustness.

Gradient Descent minimizes:
A) Accuracy
B) Loss function
C) Features
D) Learning rate
Answer: B) Loss function
Explanation: Gradient descent iteratively reduces the error.

Dimensionality reduction is performed by:
A) SVM
B) PCA
C) KNN
D) Naive Bayes
Answer: B) PCA
Explanation: Principal Component Analysis reduces features while retaining variance.

Which ML algorithm is a lazy learner?
A) SVM
B) Decision Tree
C) KNN
D) Naive Bayes
Answer: C) KNN
Explanation: KNN stores the training data and delays computation until a query arrives.

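"Lazy" is visible in code: KNN has no training step at all; every distance computation happens at prediction time. The points, labels, and query below are illustrative.

```python
# Sketch of k-nearest-neighbor classification: all work happens at query time.
from collections import Counter
import math

def knn_predict(points, labels, query, k=3):
    # compute the distance from every stored point to the query
    dists = sorted((math.dist(p, query), lab) for p, lab in zip(points, labels))
    nearest = [lab for _, lab in dists[:k]]
    return Counter(nearest).most_common(1)[0][0]  # majority vote

points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, query=(0.5, 0.5)))  # → "a"
```

Contrast this with an "eager" learner like a decision tree, which does its work up front and answers queries from the fitted structure.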
Reinforcement learning is based on:
A) Labeled data
B) Rewards and penalties
C) Clustering
D) Regression
Answer: B) Rewards and penalties
Explanation: RL agents learn through feedback from their actions.

Random Forest is an example of:
A) Regression only
B) Ensemble method
C) Dimensionality reduction
D) Clustering
Answer: B) Ensemble method
Explanation: Random Forest combines multiple decision trees.

Dropout in neural networks is used for:
A) Increasing training speed
B) Preventing overfitting
C) Data normalization
D) Activation function
Answer: B) Preventing overfitting
Explanation: Dropout randomly ignores neurons during training.

Which activation function is widely used in hidden layers?
A) Sigmoid
B) ReLU
C) Step
D) Linear
Answer: B) ReLU
Explanation: ReLU speeds up training and helps avoid vanishing gradients.

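ReLU is the simplest of the activations in this quiz: it passes positive inputs through unchanged and zeroes out negatives, so its gradient is 1 on the positive side rather than the vanishing slopes of sigmoid or tanh.

```python
# Sketch of the ReLU activation: max(0, x).

def relu(x):
    return max(0.0, x)

print([relu(v) for v in [-2.0, -0.5, 0.0, 1.5]])  # → [0.0, 0.0, 0.0, 1.5]
```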
Overfitting means:
A) Good on test, poor on train
B) Good on train, poor on test
C) Low variance
D) No errors
Answer: B) Good on train, poor on test
Explanation: The model memorizes instead of generalizing.

Which is not supervised?
A) Linear regression
B) Decision tree
C) K-means
D) Random forest
Answer: C) K-means
Explanation: K-means is unsupervised clustering.

PCA is used for:
A) Clustering
B) Dimensionality reduction
C) Classification
D) Regression
Answer: B) Dimensionality reduction
Explanation: PCA reduces features while preserving variance.

Gradient descent updates parameters by moving:
A) Towards increasing error
B) Towards decreasing error
C) Randomly
D) Away from minimum
Answer: B) Towards decreasing error
Explanation: It minimizes the cost function.

Label encoding is applied to:
A) Numerical data
B) Categorical data
C) Missing values
D) Outliers
Answer: B) Categorical data
Explanation: It converts categories into numbers.

Which is a supervised learning algorithm?
A) K-Means
B) Linear Regression
C) Apriori
D) PCA
Answer: B) Linear Regression
Explanation: Linear regression learns from labeled training data.

Overfitting occurs when:
A) Model performs well on training but poorly on test data
B) Model performs well on both training and test data
C) Model performs poorly on training data
D) Model ignores test data
Answer: A) Model performs well on training but poorly on test data
Explanation: Overfitting captures noise and fails to generalize.

Which evaluation metric is best for imbalanced datasets?
A) Accuracy
B) Precision & Recall
C) Loss function
D) MSE
Answer: B) Precision & Recall
Explanation: Precision and recall better handle class imbalance.

Which algorithm is used for dimensionality reduction?
A) PCA
B) KNN
C) Decision Tree
D) Logistic Regression
Answer: A) PCA
Explanation: Principal Component Analysis reduces features.

Gradient Descent is used for:
A) Data preprocessing
B) Optimizing model parameters
C) Feature selection
D) Regularization
Answer: B) Optimizing model parameters
Explanation: Gradient Descent minimizes loss functions.

Which algorithm is suitable for text classification?
A) Naïve Bayes
B) K-Means
C) Apriori
D) DBSCAN
Answer: A) Naïve Bayes
Explanation: Naïve Bayes works well with high-dimensional text data.

Which method is commonly used to prevent overfitting?
A) Data augmentation
B) Regularization
C) Cross-validation
D) All of the above
Answer: D) All of the above
Explanation: All these methods improve model generalization.

In decision trees, the Gini index and entropy are used to measure:
A) Overfitting
B) Node impurity
C) Data scaling
D) Accuracy
Answer: B) Node impurity
Explanation: They help decide the best attribute for splitting.

Which algorithm is non-parametric?
A) Logistic Regression
B) KNN
C) Linear Regression
D) Linear SVM
Answer: B) KNN
Explanation: KNN does not assume any fixed data distribution.

The universal approximation theorem is associated with:
A) Decision trees
B) Neural networks
C) SVM
D) Random Forests
Answer: B) Neural networks
Explanation: Neural networks can approximate any continuous function given enough neurons.

Which is a supervised learning algorithm?
A) K-means
B) PCA
C) Decision Tree
D) Apriori
Answer: C) Decision Tree
Explanation: Decision trees learn from labeled training data.

Overfitting in ML happens when:
A) Model is too simple
B) Model memorizes training data
C) Data is missing
D) Training set is too large
Answer: B) Model memorizes training data
Explanation: Overfitting reduces generalization to unseen data.

Which evaluation metric is best for imbalanced datasets?
A) Accuracy
B) Precision-Recall
C) Execution time
D) MSE
Answer: B) Precision-Recall
Explanation: Precision and recall handle class imbalance better than accuracy.

Gradient Descent is used for:
A) Model evaluation
B) Optimization
C) Data preprocessing
D) Feature extraction
Answer: B) Optimization
Explanation: Gradient Descent minimizes cost functions in ML models.

Which algorithm is inspired by the brain's neuron structure?
A) Random Forest
B) SVM
C) Neural Networks
D) Decision Trees
Answer: C) Neural Networks
Explanation: Neural networks simulate biological neurons.

Which ML algorithm is mainly used for classification tasks?
A) K-means
B) Logistic Regression
C) Apriori
D) K-means++
Answer: B) Logistic Regression
Explanation: Logistic regression predicts categorical outcomes.

Which algorithm is based on the "nearest neighbors" concept?
A) Decision Tree
B) KNN
C) SVM
D) Naïve Bayes
Answer: B) KNN
Explanation: KNN classifies data based on its nearest labeled neighbors.

Which evaluation metric is best for imbalanced datasets?
A) Accuracy
B) Precision & Recall
C) Mean Squared Error
D) Variance
Answer: B) Precision & Recall
Explanation: For imbalanced data, precision and recall are more reliable than accuracy.

Which algorithm is used for classification?
A) K-means
B) Decision Tree
C) Apriori
D) PCA
Answer: B) Decision Tree
Explanation: Decision Trees classify data into categories.

Which library is widely used for ML in Python?
A) NumPy
B) Pandas
C) Scikit-learn
D) Matplotlib
Answer: C) Scikit-learn
Explanation: Scikit-learn provides ready-to-use implementations of most classical ML algorithms.