Machine Learning MCQs
 What is Machine Learning (ML)?
 The autonomous acquisition of knowledge through the use of manual programs
 The selective acquisition of knowledge through the use of computer programs
 The selective acquisition of knowledge through the use of manual programs
 The autonomous acquisition of knowledge through the use of computer programs
Correct option is D
 Father of Machine Learning (ML)
 Geoffrey Chaucer
 Geoffrey Hill
 Geoffrey Everest Hinton
 None of the above
Correct option is C
 Which is FALSE regarding regression?
 It may be used for interpretation
 It is used for prediction
 It discovers causal relationships
 It relates inputs to outputs
Correct option is C
 Choose the correct option regarding machine learning (ML) and artificial intelligence (AI)
 ML is a set of techniques that turns a dataset into a software
 AI is a software that can emulate the human mind
 ML is an alternate way of programming intelligent machines
 All of the above
Correct option is D
 Which of the following is not a factor affecting the performance of a learner system?
 Good data structures
 Representation scheme used
 Training scenario
 Type of feedback
Correct option is A
 In general, to have a well-defined learning problem, we must identify which of the following?
 The class of tasks
 The measure of performance to be improved
 The source of experience
 All of the above
Correct option is D
 Successful applications of ML
 Learning to recognize spoken words
 Learning to drive an autonomous vehicle
 Learning to classify new astronomical structures
 Learning to play world-class backgammon
 All of the above
Correct option is E
 Which of the following is not a learning method?
 Analogy
 Introduction
 Memorization
 Deduction
Correct option is B
 In language understanding, which of the following is not a level of knowledge?
 Empirical
 Logical
 Phonological
 Syntactic
Correct option is A
 Designing a machine learning approach involves:
 Choosing the type of training experience
 Choosing the target function to be learned
 Choosing a representation for the target function
 Choosing a function approximation algorithm
 All of the above
Correct option is E
 Concept learning infers a ____-valued function from training examples of its input and output.
 Decimal
 Hexadecimal
 Boolean
 All of the above
Correct option is C
 Which of the following is not a supervised learning?
 Naive Bayesian
 PCA
 Linear Regression
 Decision Tree
Correct option is B
 What is Machine Learning?
 (i) Artificial Intelligence
 (ii) Deep Learning
 (iii) Data Statistics
 Only (i)
 (i) and (ii)
 All
 None
Correct option is B
 What kind of learning algorithm is used for “facial identities or facial expressions”?
 Prediction
 Recognition Patterns
 Generating Patterns
 Recognizing Anomalies
Correct option is B
 Which of the following is not a type of learning?
 Unsupervised Learning
 Supervised Learning
 Semi-unsupervised Learning
 Reinforcement Learning
Correct option is C
 Real-time decisions, game AI, learning tasks, skill acquisition, and robot navigation are applications of which of the following?
 Supervised Learning: Classification
 Reinforcement Learning
 Unsupervised Learning: Clustering
 Unsupervised Learning: Regression
Correct option is B
 Targeted marketing, recommender systems, and customer segmentation are applications of which of the following?
 Supervised Learning: Classification
 Unsupervised Learning: Clustering
 Unsupervised Learning: Regression
 Reinforcement Learning
Correct option is B
 Fraud detection, image classification, diagnostics, and customer retention are applications of which of the following?
 Unsupervised Learning: Regression
 Supervised Learning: Classification
 Unsupervised Learning: Clustering
 Reinforcement Learning
Correct option is B
 Which of the following is not a symbolic function representation in machine learning?
 Rules in propositional logic
 Hidden Markov Models (HMM)
 Rules in first-order predicate logic
 Decision Trees
Correct option is B
 Which of the following is not a numerical function representation in machine learning?
 Neural Network
 Support Vector Machines
 Case-based
 Linear Regression
Correct option is C
 The FIND-S algorithm starts from the most specific hypothesis and generalizes it by considering only ____ examples.
 Negative
 Positive
 Negative or Positive
 None of the above
Correct option is B
 The FIND-S algorithm ignores ____ examples.
 Negative
 Positive
 Both
 None of the above
Correct option is A
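Worked example: the two FIND-S questions above can be illustrated with a minimal sketch. It assumes attribute-value examples and conjunctive hypotheses in which "?" means "any value"; the function name `find_s` and the toy weather data are illustrative assumptions, not part of the original questions.

```python
def find_s(examples):
    """Return the most specific conjunctive hypothesis covering all positives.

    examples: list of (attribute_tuple, is_positive) pairs.
    None stands for the initial, most specific hypothesis (matches nothing).
    """
    hypothesis = None
    for attributes, is_positive in examples:
        if not is_positive:
            continue  # FIND-S ignores negative examples entirely
        if hypothesis is None:
            # First positive example: copy it as the most specific cover
            hypothesis = list(attributes)
        else:
            # Generalize only the attributes that disagree
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis


# Hypothetical training set (illustrative values only)
training = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
print(find_s(training))  # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```

The negative example never changes the hypothesis, which is why FIND-S can return an incorrect result when the training data are noisy or inconsistent.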
 The Candidate-Elimination algorithm represents the ____.
 Solution Space
 Version Space
 Elimination Space
 All of the above
Correct option is B
 Inductive learning is based on the knowledge that if something happens a lot it is likely to be generally ____.
 True
 False
Correct option is A
 Inductive learning takes examples and generalizes rather than starting with ____ knowledge.
 Inductive
 Existing
 Deductive
 None of these
Correct option is B
 A drawback of FIND-S is that it assumes consistency within the training set.
 True
 False
Correct option is A
 What strategies can help reduce overfitting in decision trees?
 (i) Enforce a maximum depth for the tree
 (ii) Enforce a minimum number of samples in leaf nodes
 (iii) Pruning
 (iv) Make sure each leaf node is one pure class
 All
 (i), (ii) and (iii)
 (i), (iii), (iv)
 None
Correct option is B
 Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?
 Decision Tree
 Random Forest
 Regression
 Classification
Correct option is B
 To find the minimum or the maximum of a function, we set the gradient to zero because:
 Depends on the type of problem
 The value of the gradient at extrema of a function is always zero
 Both (A) and (B)
 None of these
Correct option is B
 Which of the following is a disadvantage of decision trees?
 Decision trees are prone to overfitting
 Decision trees are robust to outliers
 Factor analysis
 None of the above
Correct option is A
 What is perceptron?
 A single layer feedforward neural network with preprocessing
 A neural network that contains feedback
 A double layer autoassociative neural network
 An autoassociative neural network
Correct option is A
 Which of the following is true for neural networks?
 (i) The training time depends on the size of the network
 (ii) Neural networks can be simulated on a conventional computer
 (iii) Artificial neurons are identical in operation to biological ones
 All
 Only (ii)
 (i) and (ii)
 None
Correct option is C
 What are the advantages of neural networks over conventional computers?
 (i) They have the ability to learn by example
 (ii) They are more fault tolerant
 (iii) They are more suited for real-time operation due to their high 'computational' rates
 (i) and (ii)
 (i) and (iii)
 Only (i)
 All
 None
Correct option is D
 What is Neuro software?
 It is software used by Neurosurgeon
 Designed to aid experts in real world
 It is powerful and easy neural network
 A software used to analyze neurons
Correct option is C
 Which is true for neural networks?
 Each node computes its weighted input
 Node could be in excited state or nonexcited state
 It has set of nodes and connections
 All of the above
Correct option is D
 What is the objective of backpropagation algorithm?
 To develop learning algorithm for multilayer feedforward neural network, so that network can be trained to capture the mapping implicitly
 To develop learning algorithm for multilayer feedforward neural network
 To develop learning algorithm for single layer feedforward neural network
 All of the above
Correct option is A
 Which of the following is true? Single-layer associative neural networks do not have the ability to:
 (i) Perform pattern recognition
 (ii) Find the parity of a picture
 (iii) Determine whether two or more shapes in a picture are connected or not
 (ii) and (iii)
 Only (ii)
 All
 None
Correct option is A
 The backpropagation law is also known as generalized delta rule
 True
 False
Correct option is A
 Which of the following is true?
 (i) On average, neural networks have higher computational rates than conventional computers
 (ii) Neural networks learn by example
 (iii) Neural networks mimic the way the human brain works
 All
 (ii) and (iii)
 (i), (ii) and (iii)
 None
Correct option is A
 What is true regarding backpropagation rule?
 Error in output is propagated backwards only to determine weight updates
 There is no feedback of signal at any stage
 It is also called generalized delta rule
 All of the above
Correct option is D
 There is feedback in final stage of backpropagation
 True
 False
Correct option is B
 An autoassociative network is
 A neural network that has only one loop
 A neural network that contains feedback
 A single layer feedforward neural network with preprocessing
 A neural network that contains no loops
Correct option is B
 A 3input neuron has weights 1, 4 and 3. The transfer function is linear with the constant of proportionality being equal to 3. The inputs are 4, 8 and 5 respectively. What will be the output?
 139
 153
 162
 160
Correct option is B
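Worked example: the arithmetic behind the answer above can be checked with a short sketch. It assumes "linear transfer function with constant of proportionality 3" means the output is 3 times the weighted sum of the inputs; the function name `linear_neuron` is an illustrative assumption.

```python
def linear_neuron(weights, inputs, k):
    """Output of a neuron with a linear transfer function: k * sum(w_i * x_i)."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return k * weighted_sum


# Weighted sum: 1*4 + 4*8 + 3*5 = 51, then scaled by 3
print(linear_neuron([1, 4, 3], [4, 8, 5], 3))  # 153
```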
 Which of the following is true regarding the backpropagation rule?
 Hidden layers' output is not at all important; they are only meant for supporting input and output layers
 Actual output is determined by computing the outputs of units for each hidden layer
 It is a feedback neural network
 None of the above
Correct option is B
 What is back propagation?
 It is another name given to the curvy function in the perceptron
 It is the transmission of error back through the network to allow weights to be adjusted so that the network can learn
 It is another name given to the curvy function in the perceptron
 None of the above
Correct option is B
 The general limitations of back propagation rule is/are
 Scaling
 Slow convergence
 Local minima problem
 All of the above
Correct option is D
 What is the meaning of “generalized” in the statement “backpropagation is a generalized delta rule”?
 Because delta is applied to only input and output layers, thus making it more simple and generalized
 It has no significance
 Because delta rule can be extended to hidden layer units
 None of the above
Correct option is C
 Neural networks are complex ____ functions with many parameters.
 Linear
 Non-linear
 Discrete
 Exponential
Correct option is B
 The general tasks that are performed with backpropagation algorithm
 Pattern mapping
 Prediction
 Function approximation
 All of the above
Correct option is D
 Backpropagation learning is based on gradient descent along the error surface.
 True
 False
Correct option is A
 In the backpropagation rule, how is the learning process stopped?
 No heuristic criteria exist
 On basis of average gradient value
 There is convergence involved
 None of these
Correct option is B
 Applications of NN (Neural Network)
 Risk management
 Data validation
 Sales forecasting
 All of the above
Correct option is D
 The network that involves backward links from output to the input and hidden layers is known as
 Recurrent neural network
 Self organizing maps
 Perceptrons
 Single layered perceptron
Correct option is A
 A decision tree is a display of an algorithm.
 True
 False
Correct option is A
 Which of the following is/are the decision tree nodes?
 End Nodes
 Decision Nodes
 Chance Nodes
 All of the above
Correct option is D
 End Nodes are represented by which of the following
 Solar street light
 Triangles
 Circles
 Squares
Correct option is B
 Decision Nodes are represented by which of the following
 Solar street light
 Triangles
 Circles
 Squares
Correct option is D
 Chance Nodes are represented by which of the following
 Solar street light
 Triangles
 Circles
 Squares
Correct option is C
 Advantage of Decision Trees
 Possible Scenarios can be added
 Use a white box model, if given result is provided by a model
 Worst, best and expected values can be determined for different scenarios
 All of the above
Correct option is D
 ____ terms are required for building a Bayes model.
 1
 2
 3
 4
Correct option is C
 Which of the following describes the relationship between a node and its predecessors when creating a Bayesian network?
 Conditionally independent
 Functionally dependent
 Both Conditionally dependent & Dependent
 Dependent
Correct option is A
 What is needed to make probabilistic systems feasible in the world?
 Feasibility
 Reliability
 Crucial robustness
 None of the above
Correct option is C
 Bayes rule can be used for:
 Solving queries
 Increasing complexity
 Answering probabilistic query
 Decreasing complexity
Correct option is C
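Worked example: answering a probabilistic query with Bayes' rule needs one conditional probability and two unconditional ones, as in P(H|E) = P(E|H) P(H) / P(E). The sketch below expands P(E) by total probability; the function name `posterior` and all numbers are hypothetical, chosen only for illustration.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) via Bayes' rule.

    prior:               P(H)
    likelihood:          P(E | H)
    false_positive_rate: P(E | not H)
    """
    # Total probability of the evidence: P(E) = P(E|H)P(H) + P(E|~H)P(~H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical query: a rare condition (1%) and an imperfect test
p = posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)
print(round(p, 4))  # 0.1538
```

Even with a fairly accurate test, the posterior stays low here because the prior is small.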
 ____ provides ways and means of weighing up the desirability of goals and the likelihood of achieving them.
 Utility theory
 Decision theory
 Bayesian networks
 Probability theory
Correct option is A
 Which of the following is provided by a Bayesian network?
 Complete description of the problem
 Partial description of the domain
 Complete description of the domain
 All of the above
Correct option is C
 Probability provides a way of summarizing the ____ that comes from our laziness and ignorance.
 Belief
 Uncertainty
 Joint probability distributions
 Randomness
Correct option is B
 The entries in the full joint probability distribution can be calculated as
 Using variables
 Both Using variables & information
 Using information
 All of the above
Correct option is C
 A causal chain (for example, smoking causes cancer) gives rise to:
 Conditional independence
 Conditional dependence
 Both
 None of the above
Correct option is A
 The bayesian network can be used to answer any query by using:
 Full distribution
 Joint distribution
 Partial distribution
 All of the above
Correct option is B
 Bayesian networks allow compact specification of:
 Joint probability distributions
 Belief
 Propositional logic statements
 All of the above
Correct option is A
 The compactness of the bayesian network can be described by
 Fully structured
 Locally structured
 Partially structured
 All of the above
Correct option is B
 The Expectation-Maximization algorithm has been used to identify conserved domains in unaligned proteins only. State True or False.
 True
 False
Correct option is B
 Which of the following is correct about the Naive Bayes?
 Assumes that all the features in a dataset are independent
 Assumes that all the features in a dataset are equally important
 Both
 All of the above
Correct option is C
 Which of the following is false regarding EM Algorithm?
 The alignment provides an estimate of the base or amino acid composition of each column in the site
 The column-by-column composition of the site already available is used to estimate the probability of finding the site at any position in each of the sequences
 The row-by-column composition of the site already available is used to estimate the probability
 None of the above
Correct option is C
 The Naïve Bayes algorithm is a ____ learning algorithm.
 Supervised
 Reinforcement
 Unsupervised
 None of these
Correct option is A
 The EM algorithm includes two repeated steps; step 2 is ____.
 The normalization
 The maximization step
 The minimization step
 None of the above
Correct option is B
 Examples of Naïve Bayes Algorithm is/are
 Spam filtration
 Sentiment analysis
 Classifying articles
 All of the above
Correct option is D
 In the intermediate steps of the “EM algorithm”, the number of each base in each column is determined and then converted to fractions.
 True
 False
Correct option is A
 The Naïve Bayes algorithm is based on ____ and is used for solving classification problems.
 Bayes Theorem
 Candidate elimination algorithm
 EM algorithm
 None of the above
Correct option is A
 Types of Naïve Bayes Model:
 Gaussian
 Multinomial
 Bernoulli
 All of the above
Correct option is D
 Disadvantages of Naïve Bayes Classifier:
 Naïve Bayes assumes that all features are independent or unrelated, so it cannot learn the relationship between features
 It performs well in multiclass predictions as compared to other algorithms
 Naïve Bayes is one of the fast and easy ML algorithms to predict the class of a dataset
 It is the most popular choice for text classification problems.
Correct option is A
 The benefit of Naïve Bayes:
 Naïve Bayes is one of the fast and easy ML algorithms to predict the class of a dataset
 It is the most popular choice for text classification problems.
 It can be used for binary as well as multiclass classification
 All of the above
Correct option is D
 In which of the following types of sampling is information gathered under the opinion of an expert?
 Convenience sampling
 Judgement sampling
 Quota sampling
 Purposive sampling
Correct option is B
 Full form of MDL?
 Minimum Description Length
 Maximum Description Length
 Minimum Domain Length
 None of these
Correct option is A
 For the analysis of ML algorithms, we need
 Computational learning theory
 Statistical learning theory
 Both A & B
 None of these
Correct option is C
 PAC stands for
 Probably Approximately Correct
 Probably Approx Correct
 Probably Approximate Computation
 Probably Approx Computation
Correct option is A
 The ____ of a hypothesis h with respect to target concept c and distribution D is the probability that h will misclassify an instance drawn at random according to D.
 True Error
 Type 1 Error
 Type 2 Error
 None of these
Correct option is A
 Statement: True error is defined over the entire instance space, not just the training data.
 True
 False
Correct option is A
 Which areas does computational learning theory (CLT) comprise?
 Sample Complexity
 Computational Complexity
 Mistake Bound
 All of these
Correct option is D
 Which area of CLT answers “How many examples do we need to find a good hypothesis?”
 Sample Complexity
 Computational Complexity
 Mistake Bound
 None of these
Correct option is A
 Which area of CLT answers “How much computational power do we need to find a good hypothesis?”
 Sample Complexity
 Computational Complexity
 Mistake Bound
 None of these
Correct option is B
 Which area of CLT answers “How many mistakes will we make before finding a good hypothesis?”
 Sample Complexity
 Computational Complexity
 Mistake Bound
 None of these
Correct option is C
 Can we say that concepts described by conjunctions of Boolean literals are PAC learnable?
 Yes
 No
Correct option is A
 How large is the hypothesis space of conjunctions of Boolean literals when we have n Boolean attributes?
 |H| = 3^n
 |H| = 2^n
 |H| = 1^n
 |H| = 4^n
Correct option is A
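Worked example: the 3^n count can be checked by enumeration. The sketch assumes each of the n Boolean attributes appears in a conjunction in one of three ways: as a positive literal ("1"), as a negated literal ("0"), or not at all ("?"). The function name `conjunctive_hypotheses` is an illustrative assumption.

```python
from itertools import product


def conjunctive_hypotheses(n):
    """Enumerate every conjunction of Boolean literals over n attributes.

    Each position is "1" (attribute must be true), "0" (attribute must be
    false), or "?" (attribute does not appear in the conjunction).
    """
    return ["".join(choice) for choice in product("10?", repeat=n)]


for n in (1, 2, 3):
    print(n, len(conjunctive_hypotheses(n)))  # sizes 3, 9, 27
```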
 The VC dimension of hypothesis space H1 is larger than the VC dimension of hypothesis space H2. Which of the following can be inferred from this?
 The number of examples required for learning a hypothesis in H1 is larger than the number of examples required for H2
 The number of examples required for learning a hypothesis in H1 is smaller than the number of examples required for H2
 No relation to number of samples required for PAC learning.
Correct option is A
 For a particular learning task, the required error parameter changes from 0.1 to 0.01. How many more samples will be required for PAC learning?
 Same
 2 times
 1000 times
 10 times
Correct option is D
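Worked example: the tenfold increase follows from the 1/epsilon factor in the usual sample-complexity bound. The sketch assumes the standard bound for a consistent learner over a finite hypothesis space, m >= (1/epsilon) * (ln|H| + ln(1/delta)); the function name `pac_sample_bound` and the values n = 10 and delta = 0.05 are illustrative assumptions.

```python
import math


def pac_sample_bound(epsilon, delta, hypothesis_space_size):
    """Lower bound on training examples: (ln|H| + ln(1/delta)) / epsilon."""
    return (math.log(hypothesis_space_size) + math.log(1.0 / delta)) / epsilon


# Conjunctions over n = 10 Boolean attributes: |H| = 3**10 (assumed setting)
h_size = 3 ** 10
m_loose = pac_sample_bound(epsilon=0.1, delta=0.05, hypothesis_space_size=h_size)
m_tight = pac_sample_bound(epsilon=0.01, delta=0.05, hypothesis_space_size=h_size)

print(math.ceil(m_loose), math.ceil(m_tight))
print(m_tight / m_loose)  # ratio of the two bounds, approximately 10
```

Only epsilon changes between the two calls, so the bound scales by 0.1 / 0.01 regardless of the hypothesis space or delta.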
 Computational complexity of classes of learning problems depends on which of the following?
 The size or complexity of the hypothesis space considered by learner
 The accuracy to which the target concept must be approximated
 The probability that the learner will output a successful hypothesis
 All of these
Correct option is D
 The instance-based learner is a
 Lazy learner
 Eager learner
 Can't say
Correct option is A
 When should nearest neighbour algorithms be considered?
 Instances map to points in R^n
 Not more than 20 attributes per instance
 Lots of training data
 None of these
 A, B & C
Correct option is E
 What are the advantages of the nearest neighbour algorithm?
 Training is very fast
 Can learn complex target functions
 Don't lose information
 All of these
Correct option is D
 What are the difficulties with the k-nearest neighbour algorithm?
 Calculate the distance of the test case from all training cases
 Curse of dimensionality
 Both A & B
 None of these
Correct option is C
 What if the target function is real-valued in the kNN algorithm?
 Calculate the mean of the k nearest neighbours
 Calculate the SD of the k nearest neighbours
 None of these
Correct option is A
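Worked example: for a real-valued target, kNN predicts the mean of the target values of the k nearest neighbours. The sketch below assumes Euclidean distance; the function name `knn_regress` and the toy data are illustrative assumptions.

```python
import math


def knn_regress(train, query, k):
    """Predict a real value as the mean target of the k nearest training points.

    train: list of (point, value) pairs, where point is a tuple of floats.
    """
    # Sort by Euclidean distance to the query and keep the k closest
    neighbours = sorted(train, key=lambda pv: math.dist(pv[0], query))[:k]
    return sum(value for _, value in neighbours) / len(neighbours)


# Hypothetical one-dimensional data
data = [((1.0,), 1.0), ((2.0,), 2.0), ((3.0,), 3.0), ((10.0,), 10.0)]
print(knn_regress(data, (2.1,), k=3))  # 2.0 (mean of 2.0, 3.0 and 1.0)
```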
 What is/are true about distance-weighted kNN?
 The weight of the neighbour is considered
 The distance of the neighbour is considered
 Both A & B
 None of these
Correct option is C
 What is/are advantage(s) of distance-weighted kNN over kNN?
 Robust to noisy training data
 Quite effective when a sufficiently large set of training data is provided
 Both A & B
 None of these
Correct option is C
 What is/are advantage(s) of Locally Weighted Regression?
 Pointwise approximation of complex target function
 Earlier data has no influence on the new ones
 Both A & B
 None of these
Correct option is C
 In locally weighted regression (LWR), the quality of the result depends on
 Choice of the function
 Choice of the kernel function K
 Choice of the hypothesis space H
 All of these
Correct option is D
 How many types of layers are there in radial basis function neural networks?
 3
 2
 1
 4
Correct option is A (input layer, hidden layer, and output layer)
 The neurons in the hidden layer contain Gaussian transfer functions whose outputs are ____ proportional to the distance from the centre of the neuron.
 Directly
 Inversely
 equal
 None of these
Correct option is B
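Worked example: a Gaussian transfer function responds most strongly at the neuron's centre and falls off as the input moves away, which is the sense in which the output is inversely related to distance. The sketch assumes the common form exp(-d^2 / (2 * sigma^2)); the function name `gaussian_rbf` and the spread parameter `sigma` are illustrative assumptions.

```python
import math


def gaussian_rbf(x, centre, sigma=1.0):
    """Gaussian radial basis activation: exp(-d^2 / (2 * sigma^2))."""
    d = math.dist(x, centre)  # Euclidean distance from the neuron's centre
    return math.exp(-(d * d) / (2.0 * sigma * sigma))


centre = (0.0, 0.0)
for point in [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]:
    # Activation shrinks monotonically as the point moves away from the centre
    print(point, gaussian_rbf(point, centre))
```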
 PNN/GRNN networks have one neuron for each point in the training file, while RBF networks have a variable number of neurons that is usually
 less than the number of training points
 greater than the number of training points
 equal to the number of training points
 None of these
Correct option is A
 Which network is more accurate when the size of the training set is small to medium?
 PNN/GRNN
 RBF
 K-means clustering
 None of these
Correct option is A
 What is/are true about RBF network?
 A kind of supervised learning
 Design of NN as curve fitting problem
 Use of multidimensional surface to interpolate the test data
 All of these
Correct option is D
 Application of CBR
 Design
 Planning
 Diagnosis
 All of these
Correct option is D
 What is/are advantages of CBR?
 A local approx. is found for each test case
 Knowledge is in a form understandable to human
 Fast to train
 All of these
Correct option is D
 In the kNN algorithm, given a set of training examples and a value of k < size of the training set (n), the algorithm predicts the class of a test example to be the
 Least frequent class among the classes of the k closest training examples
 Most frequent class among the classes of the k closest training examples
 Class of the closest training example
 Most frequent class among the classes of the k farthest training examples.
Correct option is B
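Worked example: kNN classification as a majority vote among the k closest training examples. The sketch assumes Euclidean distance and breaks ties by whichever label `Counter` lists first; the function name `knn_classify` and the toy points are illustrative assumptions.

```python
import math
from collections import Counter


def knn_classify(train, query, k):
    """Return the most frequent class among the k nearest training examples.

    train: list of (point, label) pairs, where point is a tuple of numbers.
    """
    neighbours = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical two-class data
points = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"),
]
print(knn_classify(points, (0.5, 0.5), k=3))  # a
print(knn_classify(points, (5, 5.5), k=3))    # b (two "b" votes beat one "a")
```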
 Which of the following statements is true about PCA?
 (i) We must standardize the data before applying PCA
 (ii) We should select the principal components which explain the highest variance
 (iii) We should select the principal components which explain the lowest variance
 (iv) We can use PCA for visualizing the data in lower dimensions
 (i), (ii) and (iv).
 (ii) and (iv)
 (iii) and (iv)
 (i) and (iii)
Correct option is A
 A genetic algorithm is a
 Search technique used in computing to find true or approximate solutions to optimization and search problems
 Sorting technique used in computing to find true or approximate solutions to optimization and sort problems
 Both A & B
 None of these
Correct option is A
 GA techniques are inspired by ____ biology.
 Evolutionary
 Cytology
 Anatomy
 Ecology
Correct option is A
 When would the genetic algorithm terminate?
 Maximum number of generations has been produced
 Satisfactory fitness level has been reached for the population
 Both A & B
 None of these
Correct option is C
 The algorithm operates by iteratively updating a pool of hypotheses, called the ____.
 Population
 Fitness
 None of these
Correct option is A
 What is the correct representation of GA?
 GA(Fitness, Fitness_threshold, p)
 GA(Fitness, Fitness_threshold, p, r )
 GA(Fitness, Fitness_threshold, p, r, m)
 GA(Fitness, Fitness_threshold)
Correct option is C
 Genetic operators include
 Crossover
 Mutation
 Both A & B
 None of these
Correct option is C
 The operator that produces two new offspring from two parent strings by copying selected bits from each parent is called
 Mutation
 Inheritance
 Crossover
 None of these
Correct option is C
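Worked example: a minimal sketch of single-point crossover, one common form of the operator described above. The crossover point is passed in explicitly to keep the example deterministic (a GA would normally pick it at random); the function name `single_point_crossover` and the parent strings are illustrative assumptions.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the tails of two equal-length bit strings at the given point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    # Each child takes its head from one parent and its tail from the other
    child_1 = parent_a[:point] + parent_b[point:]
    child_2 = parent_b[:point] + parent_a[point:]
    return child_1, child_2


offspring = single_point_crossover("11101001000", "00001010101", point=5)
print(offspring)  # ('11101010101', '00001001000')
```

Mutation, the other operator named in the questions above, would instead flip a single randomly chosen bit of one string.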
 Each schema represents the set of bit strings containing the indicated ____.
 0s, 1s
 only 0s
 only 1s
 0s, 1s, *s
Correct option is D
 0*10 represents the set of bit strings that includes exactly
 0010, 0110
 0010, 0010
 0100, 0110
 0100, 0010
Correct option is A
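Worked example: a schema can be expanded mechanically by letting every "*" stand for either bit. The function name `expand_schema` is an illustrative assumption.

```python
from itertools import product


def expand_schema(schema):
    """List every bit string matched by a schema over the alphabet {0, 1, *}."""
    # A "*" position may be 0 or 1; a fixed position keeps its own bit
    slots = [("0", "1") if symbol == "*" else (symbol,) for symbol in schema]
    return ["".join(bits) for bits in product(*slots)]


print(expand_schema("0*10"))  # ['0010', '0110']
```

A schema with j "*" symbols matches 2^j bit strings, so "0*10" matches exactly two.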
 If correct(h) is the percentage of all training examples correctly classified by hypothesis h, then the fitness function is equal to
 Fitness(h) = (correct(h))^2
 Fitness(h) = (correct(h))^3
 Fitness(h) = correct(h)
 Fitness(h) = (correct(h))^4
Correct option is A
 Statement: In genetic programming, individuals in the evolving population are computer programs rather than bit strings.
 True
 False
Correct option is A
 The ____ view holds that evolution over many generations was directly influenced by the experiences of individual organisms during their lifetime.
 Baldwin
 Lamarckian
 Bayes
 None of these
Correct option is B
 Search through the hypothesis space cannot be characterized. Why?
 Hypotheses are created by crossover and mutation operators that allow radical changes between successive generations
 Hypotheses are not created by crossover and mutation
 None of these
Correct option is A
 ILP stands for
 Inductive Logical programming
 Inductive Logic Programming
 Inductive Logical Program
 Inductive Logic Program
Correct option is B
 What is/are the requirement(s) for the Learn-One-Rule method?
 Input: accepts a set of +ve and -ve training examples.
 Output: delivers a single rule that covers many +ve examples and few -ve.
 Output rule has a high accuracy but not necessarily a high coverage
 A & B
 A, B & C
Correct option is E
 A ____ is any predicate (or its negation) applied to any set of terms.
 Literal
 Null
 Clause
 None of these
Correct option is A
 A ground literal is a literal that
 Contains only variables
 Does not contain any functions
 Does not contain any variables
 Contains only functions
Correct option is C
 ____ emphasizes learning feedback that evaluates the learner’s performance without providing standards of correctness in the form of behavioural targets.
 Reinforcement learning
 Supervised Learning
 None of these
Correct option is A
 Features of Reinforcement learning
 A set of problems rather than a set of techniques
 RL is training by rewards and punishments
 RL is learning from trial and error with the world
 All of these
Correct option is D
 Which type of feedback is used by RL?
 Purely Instructive feedback
 Purely Evaluative feedback
 Both A & B
 None of these
Correct option is B
 What is/are the problem solving methods for RL?
 Dynamic programming
 Monte Carlo Methods
 Temporal-difference learning
 All of these
Correct option is D
 The FIND-S algorithm
 Starts from the most specific hypothesis
 It considers negative examples
 It considers both negative and positive examples
 None of these
Correct option is A
 The hypothesis space has a general-to-specific ordering of hypotheses, and the search can be efficiently organized by taking advantage of a naturally occurring structure over the hypothesis space.
 TRUE
 FALSE
Correct option is A
 The version space is:
 The subset of all hypotheses consistent with the training examples is called the version space with respect to the hypothesis space H and the training examples D, because it contains all plausible versions of the target concept
 The version space consists of only specific hypotheses
 None of these
Correct option is A
 The Candidate-Elimination algorithm
 The key idea in the Candidate-Elimination algorithm is to output a description of the set of all hypotheses consistent with the training examples
 The Candidate-Elimination algorithm computes the description of this set without explicitly enumerating all of its members
 This is accomplished by using the more-general-than partial ordering and maintaining a compact representation of the set of consistent hypotheses
 All of these
Correct option is D
 Concept learning is basically acquiring the definition of a general category from given sample positive and negative training examples of the category.
 TRUE
 FALSE
Correct option is A
 The hypothesis h1 is more-general-than hypothesis h2 (h1 > h2) if and only if h1 ≥ h2 is true and h2 ≥ h1 is false. We also say h2 is more-specific-than h1.
 The statement is true
 The statement is false
 We cannot say
 None of these
Correct option is A
 The List-Then-Eliminate algorithm
 The List-Then-Eliminate algorithm initializes the version space to contain all hypotheses in H, then eliminates any hypothesis found inconsistent with any training example
 The List-Then-Eliminate algorithm does not initialize the version space
 None of these
Correct option is A
 What will take place as the agent observes its interactions with the world?
 Learning
 Hearing
 Perceiving
 Speech
Correct option is A
 Which element modifies the performance element so that it makes better decisions?
 Performance element
 Changing element
 Learning element
 None of the mentioned
Correct option is C
 Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples. This is called the:
 Inductive Learning Hypothesis
 Null Hypothesis
 Actual Hypothesis
 None of these
Correct option is A
 Feature of ANN in which ANN creates its own organization or representation of information it receives during learning time is
 Adaptive Learning
 Self Organization
 What-If Analysis
 Supervised Learning
Correct option is B
 How does a decision tree reach its decision?
 Single test
 Two tests
 Sequence of tests
 No test
Correct option is C
 Which of the following is a disadvantage of decision trees?
 Factor analysis
 Decision trees are robust to outliers
 Decision trees are prone to overfitting
 None of the above
Correct option is C
 Tree/rule-based classification algorithms generate which kind of rule to perform the classification?
 if-then
 then
 do
Correct option is A
 What is Gini Index?
 It is a type of index structure
 It is a measure of purity
 None of the options
Correct option is B
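Worked example: in decision-tree learning, the Gini index is usually computed as an impurity score, 1 - sum(p_i^2) over the class proportions p_i, so a pure node scores 0. The sketch below assumes that definition; the function name `gini_impurity` and the labels are illustrative assumptions.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a collection of class labels: 1 - sum(p_i ** 2)."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


print(gini_impurity(["yes", "yes", "yes", "yes"]))  # 0.0 for a pure node
print(gini_impurity(["yes", "yes", "no", "no"]))    # 0.5 for an even two-class split
```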
 What is not a RNN in machine learning?
 One output to many inputs
 Many inputs to a single output
 RNNs for nonsequential input
 Many inputs to many outputs
Correct option is A
 Which of the following sentences are correct in reference to Information gain?
 It is biased towards multivalued attributes
 ID3 makes use of information gain
 The approach used by ID3 is greedy
 All of these
Correct option is D
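Worked example: information gain is the drop in entropy after splitting on an attribute, and ID3 greedily picks the attribute with the largest gain. The sketch below also shows the bias towards multivalued attributes: a unique identifier column gets the maximum gain even though it cannot generalize. The function names and the toy rows are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute_index):
    """Entropy of the labels minus the weighted entropy after the split."""
    groups = defaultdict(list)
    for row, label in zip(rows, labels):
        groups[row[attribute_index]].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: columns are outlook, temperature, and a unique row id
rows = [
    ("sunny", "hot", "d1"),
    ("sunny", "mild", "d2"),
    ("rainy", "hot", "d3"),
    ("rainy", "mild", "d4"),
]
labels = ["no", "no", "yes", "yes"]

for index, name in enumerate(["outlook", "temperature", "row id"]):
    print(name, information_gain(rows, labels, index))
```

Here outlook and the row id both reach the maximum gain of 1 bit, while temperature gains nothing; gain ratio is one common correction for the identifier problem.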
 A neural network can answer
 For-loop questions
 What-if questions
 If-Then-Else analysis questions
 None of these
Correct option is B
 Artificial neural network used for
 Pattern Recognition
 Classification
 Clustering
 All
Correct option is D
 Which of the following are the advantage/s of Decision Trees?
 Possible Scenarios can be added
 Use a white box model, If given result is provided by a model
 Worst, best and expected values can be determined for different scenarios
 All of the mentioned
Correct option is D
 What is the mathematical likelihood that something will occur?
 Classification
 Probability
 Naïve Bayes Classifier
 None of the other
Correct option is B
 What does a Bayesian network provide?
 Complete description of the domain
 Partial description of the domain
 Complete description of the problem
 None of the mentioned
Correct option is A
 Where can Bayes' rule be used?
 Solving queries
 Increasing complexity
 Decreasing complexity
 Answering probabilistic query
Correct option is D
 How many terms are required for building a Bayes model?
 2
 3
 4
 1
Correct option is B
 What is needed to make probabilistic systems feasible in the world?
 Reliability
 Crucial robustness
 Feasibility
 None of the mentioned
Correct option is B
 It was shown that the Naive Bayesian method
 Can be much more accurate than the optimal Bayesian method
 Is always worse off than the optimal Bayesian method
 Can be almost optimal only when attributes are independent
 Can be almost optimal when some attributes are dependent
Correct option is C
 What is the consequence between a node and its predecessors while creating Bayesian network?
 Functionally dependent
 Dependent
 Conditionally independent
 Both conditionally dependent & dependent
Correct option is C
 How can the compactness of a Bayesian network be described?
 Locally structured
 Fully structured
 Partial structure
 All of the mentioned
Correct option is A
 How can the entries in the full joint probability distribution be calculated?
 Using variables
 Using information
 Both Using variables & information
 None of the mentioned
Correct option is B
 How can a Bayesian network be used to answer any query?
 Full distribution
 Joint distribution
 Partial distribution
 All of the mentioned
Correct option is B
 Sample Complexity is
 The number of training samples that we need to supply to the algorithm, so that the function returned by the algorithm is within an arbitrarily small error of the best possible function, with probability arbitrarily close to 1
 How many training examples are needed for a learner to converge to a successful hypothesis
 All of these
Correct option is C
 PAC stands for
 Probably Approximately Correct
 Probability Applied Correctly
 Partition Approximately Correct
Correct option is A
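Sample complexity in the PAC setting can be made concrete with one standard bound. The sketch below assumes a finite hypothesis space and a consistent learner, for which m >= (1/epsilon) * (ln|H| + ln(1/delta)) training examples suffice; the function name `pac_sample_bound` is hypothetical.

```python
from math import ceil, log

def pac_sample_bound(hypothesis_space_size, epsilon, delta):
    """Examples sufficient for a consistent learner over a finite hypothesis
    space to reach error <= epsilon with probability >= 1 - delta."""
    return ceil((log(hypothesis_space_size) + log(1 / delta)) / epsilon)

# Conjunctions over 10 boolean attributes: each attribute appears positively,
# negatively, or not at all, so |H| = 3**10.
m = pac_sample_bound(3 ** 10, epsilon=0.1, delta=0.05)
print(m)
```

The bound grows only logarithmically in the size of the hypothesis space and in 1/delta, but linearly in 1/epsilon.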
 Which of the following is true about k in kNN in terms of variance?
 When you increase k, the variance increases
 When you decrease k, the variance increases
 Can't say
 None of these
Correct option is B
 Which of the following options is true about the kNN algorithm?
 It can be used for classification
 It can be used for regression
 It can be used in both classification and regression
Correct option is C
 In kNN, overfitting is very likely due to the curse of dimensionality. Which of the following options would you consider to handle this problem? 1) Dimensionality reduction 2) Feature selection
 1
 2
 1 and 2
 None of these
Correct option is C
 When you find noise in the data, which of the following options would you consider in kNN?
 I will increase the value of k
 I will decrease the value of k
 Noise cannot be dependent on the value of k
 None of these
Correct option is A
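Why a larger k helps with noisy data can be seen in a small sketch. The 1-D data set below is hypothetical and contains one deliberately mislabeled point; `knn_predict` is an illustrative helper, not a library function.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to `query` (1-D inputs)."""
    neighbors = sorted(train, key=lambda point: abs(point[0] - query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical data: class "a" on the left, class "b" on the right,
# plus a single mislabeled (noisy) point at x = 2.1.
train = [(1.0, "a"), (1.5, "a"), (2.0, "a"), (2.5, "a"), (3.0, "a"),
         (2.1, "b"),  # label noise
         (7.0, "b"), (7.5, "b"), (8.0, "b")]

pred_k1 = knn_predict(train, 2.12, k=1)  # follows the noisy neighbor
pred_k5 = knn_predict(train, 2.12, k=5)  # the noisy vote is outvoted
print(pred_k1, pred_k5)
```

With k = 1 the prediction copies the single noisy neighbor (high variance); with k = 5 the vote is smoothed over the neighborhood, at the cost of higher bias.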
 Which of the following is true about k in kNN in terms of bias?
 When you increase k, the bias increases
 When you decrease k, the bias increases
 Can't say
 None of these
Correct option is A
 What is used to mitigate overfitting in a test set?
 Overfitting set
 Training set
 Validation dataset
 Evaluation set
Correct option is C
 A radial basis function is a
 Activation function
 Weight
 Learning rate
 none
Correct option is A
 Mistake Bound is
 How many training examples are needed for learner to converge to a successful hypothesis.
 How much computational effort is needed for a learner to converge to a successful hypothesis
 How many training examples will the learner misclassify before converging to a successful hypothesis
 None of these
Correct option is C
 All of the following are suitable problems for genetic algorithms EXCEPT
 dynamic process control
 pattern recognition with complex patterns
 simulation of biological models
 simple optimization with few variables
Correct option is D
 Adding more basis functions in a linear model… (Pick the most probable option)
 Decreases model bias
 Decreases estimation bias
 Decreases variance
 Doesn't affect bias and variance
Correct option is A
 Which of these are types of crossover
 Single point
 Two point
 Uniform
 All of these
Correct option is D
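The three crossover types listed above differ only in the mask that decides which parent supplies each bit. A minimal sketch, assuming fixed-length bit strings; the function name `crossover` and the example strings and masks are illustrative.

```python
def crossover(parent1, parent2, mask):
    """Build two offspring from two equal-length bit strings.

    Where the mask is '1' the first child copies parent1 and the second
    copies parent2; where it is '0' the roles are swapped.
    """
    child1 = "".join(a if m == "1" else b for a, b, m in zip(parent1, parent2, mask))
    child2 = "".join(b if m == "1" else a for a, b, m in zip(parent1, parent2, mask))
    return child1, child2

p1, p2 = "11101001000", "00001010101"

single_point = crossover(p1, p2, "11111000000")  # one contiguous block of 1s from the left
two_point = crossover(p1, p2, "00111110000")     # a block of 1s in the middle
uniform = crossover(p1, p2, "10011010011")       # each mask bit chosen independently

print(single_point, two_point, uniform, sep="\n")
```

In practice the mask is drawn at random for every pair of parents; fixed masks are used here so the result is reproducible.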
 A feature F1 can take the values A, B, C, D, E, and F, and represents the grades of students from a college. Which of the following statements is true in this case?
 Feature F1 is an example of a nominal variable
 Feature F1 is an example of an ordinal variable
 It doesn't belong to any of the above categories.
Correct option is B
 You observe the following while fitting a linear regression to the data: As you increase the amount of training data, the test error decreases and the training error increases. The train error is quite low (almost what you expect it to), while the test error is much higher than the train error. What do you think is the main reason behind this behaviour? Choose the most probable option.
 High variance
 High model bias
 High estimation bias
 None of the above
Correct option is C
 Genetic algorithms are heuristic methods that do not guarantee an optimal solution to a problem
 TRUE
 FALSE
Correct option is A
 Which of the following statements about regularization is correct?
 Using too large a value of lambda can cause your hypothesis to underfit the data
 Using too large a value of lambda can cause your hypothesis to overfit the data
 Using a very large value of lambda cannot hurt the performance of your hypothesis.
 None of the above
Correct option is A
 Consider the following: (a) Evolution (b) Selection (c) Reproduction (d) Mutation Which of the following are found in genetic algorithms?
 All
 a, b, c
 a, b
 b, d
Correct option is A
 Genetic algorithms are
 A part of evolutionary computing
 Inspired by Darwin's theory of evolution: "survival of the fittest"
 Adaptive heuristic search algorithms based on the evolutionary ideas of natural selection and genetics
 All of the above
Correct option is D
 Genetic algorithms belong to the family of methods in the
 artificial intelligence area
 optimization
 complete enumeration family of methods
 Noncomputer based (human) solutions area
Correct option is A
 For a two player chess game, the environment encompasses the opponent
 True
 False
Correct option is A
 Which among the following is not a necessary feature of a reinforcement learning solution to a learning problem?
 exploration versus exploitation dilemma
 trial and error approach to learning
 learning based on rewards
 representation of the problem as a Markov Decision Process
Correct option is D
 Which of the following sentences is FALSE regarding reinforcement learning?
 It relates inputs to outputs.
 It is used for prediction.
 It may be used for interpretation.
 It discovers causal relationships.
Correct option is D
 The EM algorithm is guaranteed to never decrease the value of its objective function on any iteration
 TRUE
 FALSE
Correct option is A
 Consider the following modification to the tic-tac-toe game: at the end of the game, a coin is tossed and the agent wins if a head appears, regardless of whatever has happened in the game. Can reinforcement learning be used to learn an optimal policy for playing tic-tac-toe in this case?
 Yes
 No
Correct option is B
 Of the two repeated steps in the EM algorithm, step 2 is _
 the maximization step
 the minimization step
 the optimization step
 the normalization step
Correct option is A
 Suppose the reinforcement learning player was greedy, that is, it always played the move that brought it to the position that it rated the best. Might it learn to play better, or worse, than a non greedy player?
 Worse
 Better
Correct option is A
 A chess agent trained by using reinforcement learning can be trained by playing against a copy of the same agent
 True
 False
Correct option is A
 The EM iteration alternates between performing an expectation (E) step, which creates a function for the expectation of the log-likelihood evaluated using the current estimate for the parameters, and a maximization (M) step, which computes parameters maximizing the expected log-likelihood found on the E step
 TRUE
 FALSE
Correct option is A
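The alternation between the E step and the M step, and the fact that the log-likelihood never decreases, can be sketched for a deliberately simple case. The example assumes a mixture of two 1-D Gaussians with known unit variance and equal mixing weights, so only the two means are estimated; the data and function names are hypothetical.

```python
from math import exp, log, pi, sqrt

def gaussian(x, mean, sigma=1.0):
    """Density of a normal distribution with the given mean and standard deviation."""
    return exp(-((x - mean) ** 2) / (2 * sigma ** 2)) / (sigma * sqrt(2 * pi))

def log_likelihood(data, means):
    """Log-likelihood of the data under an equal-weight two-component mixture."""
    return sum(log(0.5 * gaussian(x, means[0]) + 0.5 * gaussian(x, means[1]))
               for x in data)

def em_step(data, means):
    # E step: responsibility of component 0 for each point under current means.
    resp0 = []
    for x in data:
        p0, p1 = gaussian(x, means[0]), gaussian(x, means[1])
        resp0.append(p0 / (p0 + p1))
    # M step: each mean becomes the responsibility-weighted average of the data.
    weight0 = sum(resp0)
    weight1 = len(data) - weight0
    new0 = sum(r * x for r, x in zip(resp0, data)) / weight0
    new1 = sum((1 - r) * x for r, x in zip(resp0, data)) / weight1
    return (new0, new1)

data = [-2.3, -1.9, -2.1, -1.7, 2.2, 1.8, 2.0, 2.4]  # hypothetical observations
means = (-0.5, 0.5)                                   # arbitrary starting guess
history = [log_likelihood(data, means)]
for _ in range(10):
    means = em_step(data, means)
    history.append(log_likelihood(data, means))

print(means)
print(history[0], history[-1])
```

The recorded `history` is non-decreasing from one iteration to the next, which is the guarantee the earlier TRUE/FALSE question refers to; it does not guarantee convergence to a global maximum.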
 Expectation–maximization (EM) algorithm is an
 Iterative
 Incremental
 None
Correct option is A
 Features that need to be identified for a well-posed learning problem:
 Class of tasks
 Performance measure
 Training experience
 All of these
Correct option is D
 A computer program that learns to play checkers might improve its performance as:
 Measured by its ability to win at the class of tasks involving playing checkers
 Experience obtained by playing games against itself
 Both a & b
 None of these
Correct option is C
 Learning symbolic representations of concepts is known as:
 Artificial Intelligence
 Machine Learning
 Both a & b
 None of these
Correct option is A
 The field of study that gives computers the capability to learn without being explicitly programmed
 Machine Learning
 Artificial Intelligence
 Deep Learning
 Both a & b
Correct option is A
 The autonomous acquisition of knowledge through the use of computer programs is called
 Artificial Intelligence
 Machine Learning
 Deep learning
 All of these
Correct option is B
 Learning that enables massive quantities of data is known as
 Artificial Intelligence
 Machine Learning
 Deep learning
 All of these
Correct option is B
 Different learning methods do not include
 Memorization
 Analogy
 Deduction
 Introduction
Correct option is D
 Types of learning used in machine learning
 Supervised
 Unsupervised
 Reinforcement
 All of these
Correct option is D
 A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E. This describes a
 Supervised learning problem
 Unsupervised learning problem
 Well posed learning problem
 All of these
Correct option is C
 Which of the following is a widely used and effective machine learning algorithm based on the idea of bagging?
 Decision Tree
 Regression
 Classification
 Random Forest
Correct option is D
 How many types are available in machine learning?
 1
 2
 3
 4
Correct option is C
 Learning in which a model learns from the rewards it received for its previous actions is known as:
 Supervised learning
 Unsupervised learning
 Reinforcement learning
 Concept learning
Correct option is C
 A subset of machine learning that involves systems that think and learn like humans using artificial neural networks.
 Artificial Intelligence
 Machine Learning
 Deep Learning
 All of these
Correct option is C
 A learning method in which a training data contains a small amount of labeled data and a large amount of unlabeled data is known as
 Supervised Learning
 Semi Supervised Learning
 Unsupervised Learning
 Reinforcement Learning
Correct option is B
 Methods used for the calibration in Supervised Learning
 Platt Calibration
 Isotonic Regression
 All of these
 None of above
Correct option is C
 The basic design issues for designing a learning system
 Choosing the Training Experience
 Choosing the Target Function
 Choosing a Function Approximation Algorithm
 Estimating Training Values
 All of these
Correct option is E
 In Machine learning the module that must solve the given performance task is known as:
 Critic
 Generalizer
 Performance system
 All of these
Correct option is C
 A learning method in which multiple models, such as classifiers or experts, are strategically generated and combined to solve a particular computational problem is called
 Supervised Learning
 Semi Supervised Learning
 Unsupervised Learning
 Reinforcement Learning
 Ensemble learning
Correct option is E
 In a learning system, the component that takes as input the current hypothesis (currently learned function) and outputs a new problem for the Performance System to explore is known as:
 Critic
 Generalizer
 Performance system
 Experiment generator
 All of these
Correct option is D
 Learning method that is used to improve the classification, prediction, function approximation etc of a model
 Supervised Learning
 Semi Supervised Learning
 Unsupervised Learning
 Reinforcement Learning
 Ensemble learning
Correct option is E
 In a learning system the component that takes as input the history or trace of the game and produces as output a set of training examples of the target function is known as:
 Critic
 Generalizer
 Performance system
 All of these
Correct option is A
 The most common issue when using ML is
 Lack of skilled resources
 Inadequate Infrastructure
 Poor Data Quality
 None of these
Correct option is C
 How to ensure that your model is not over fitting
 Cross validation
 Regularization
 All of these
 None of these
Correct option is C
 A way to ensemble multiple classification or regression models
 Stacking
 Bagging
 Blending
 Boosting
Correct option is A
 How well a model is going to generalize in new environment is known as
 Data Quality
 Transparent
 Implementation
 None of these
Correct option is B
 Common classes of problems in machine learning are
 Classification
 Clustering
 Regression
 All of these
Correct option is D
 Cost complexity pruning algorithm is used in?
 CART
 C4.5
 ID3
 All of these
Correct option is A
 Which one of these is not a tree based learner?
 CART
 C4.5
 ID3
 Bayesian Classifier
Correct option is D
 Which one of these is a tree based learner?
 Rule based
 Bayesian Belief Network
 Bayesian classifier
 Random Forest
Correct option is D
 What is the approach of basic algorithm for decision tree induction?
 Greedy
 Top Down
 Procedural
 Step by Step
Correct option is A
 Which of the following classifications would best suit the student performance classification systems?
 If...then... analysis
 Market-basket analysis
 Regression analysis
 Cluster analysis
Correct option is A
 What are the two approaches to tree pruning?
 Pessimistic pruning and Optimistic pruning
 Post pruning and Pre pruning
 Cost complexity pruning and time complexity pruning
 None of these
Correct option is B
 How will you counter overfitting in decision tree?
 By pruning the longer rules
 By creating new rules
 Both 'By pruning the longer rules' and 'By creating new rules'
 None of these
Correct option is A
 Which of the following sentences are true?
 In prepruning a tree is ‘pruned’ by halting its construction early
 A pruning set of class-labeled tuples is used to estimate cost complexity
 The best pruned tree is the one that minimizes the number of encoding bits
 All of these
Correct option is D
 Which of the following is a disadvantage of decision trees?
 Factor analysis
 Decision trees are robust to outliers
 Decision trees are prone to be over fit
 None of the above
Correct option is C
 In which of the following scenario a gain ratio is preferred over Information Gain?
 When a categorical variable has a very large number of categories
 When a categorical variable has a very small number of categories
 The number of categories is not the reason
 None of these
Correct option is A
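Gain ratio divides information gain by the split information (the entropy of the attribute's own value distribution), which penalizes attributes with many categories. A minimal sketch on hypothetical data, where `side` and `row_id` are illustrative attribute names:

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of the value distribution in `items`."""
    total = len(items)
    return -sum((c / total) * log2(c / total) for c in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    total = len(labels)
    remainder = sum(
        count / total * entropy([y for x, y in zip(values, labels) if x == v])
        for v, count in Counter(values).items()
    )
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0

labels = ["yes"] * 4 + ["no"] * 4
side = ["l"] * 4 + ["r"] * 4      # two categories, perfectly predictive
row_id = list("abcdefgh")         # eight categories, one per row

ratio_side = gain_ratio(side, labels)
ratio_row_id = gain_ratio(row_id, labels)
print(ratio_side, ratio_row_id)
```

Both attributes have an information gain of 1 bit on this data, so plain information gain cannot separate them; the gain ratio prefers the two-category attribute because the identifier's split information is 3 bits.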
 Major pruning techniques used in decision tree are
 Minimum error
 Smallest tree
 Both a & b
 None of these
Correct option is C
 What does the central limit theorem state?
 If the sample size increases, the sampling distribution must approach a normal distribution
 If the sample size decreases, the sampling distribution must approach a normal distribution
 If the sample size increases, the sampling distribution must approach an exponential distribution
 If the sample size decreases, the sampling distribution must approach an exponential distribution
Correct option is A
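The central limit theorem can be checked empirically with a short simulation. The sketch below assumes samples drawn from Uniform(0, 1), which has mean 0.5 and variance 1/12; the sample size, repeat count, and seed are arbitrary choices.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so the sketch is repeatable

def sample_means(sample_size, repeats=2000):
    """Means of `repeats` independent samples drawn from Uniform(0, 1)."""
    return [mean([random.random() for _ in range(sample_size)])
            for _ in range(repeats)]

means = sample_means(sample_size=30)
center = mean(means)
spread = pstdev(means)

# Share of sample means within one standard deviation of the center;
# for a normal distribution this share is about 0.68.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(center, spread, within_one_sd)
```

Although the underlying distribution is flat, the sample means cluster around 0.5 with a spread close to sqrt(1/12) / sqrt(30), and roughly two thirds of them fall within one standard deviation, as expected for a bell-shaped sampling distribution.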
 The difference between the expected value of an estimate and the true value of the parameter is called:
 Bias
 Error
 Contradiction
 Difference
Correct option is A
 In which of the following types of sampling is the information gathered on the basis of an expert's opinion?
 Quota sampling
 Convenience sampling
 Purposive sampling
 Judgment sampling
Correct option is D
 Which of the following is a subset of population?
 Distribution
 Sample
 Data
 Set
Correct option is B
 The sampling error is defined as?
 Difference between population and parameter
 Difference between sample and parameter
 Difference between population and sample
 Difference between parameter and sample
Correct option is C
 Machine learning is interested in the best hypothesis h from some space H, given observed training data D. Here best hypothesis means
 Most general hypothesis
 Most probable hypothesis
 Most specific hypothesis
 None of these
Correct option is B
 Practical difficulties with Bayesian Learning :
 Initial knowledge of many probabilities is required
 No consistent hypothesis
 Hypotheses make probabilistic predictions
 None of these
Correct option is A
 Bayes’ theorem states that the relationship between the probability of the hypothesis before getting the evidence P(H) and the probability of the hypothesis after getting the evidence P(H∣E) is
 [P(E∣H)P(H)] / P(E)
 [P(E∣H) P(E) ] / P(H)
 [P(E) P(H) ] / P(E∣H)
 None of these
Correct option is A
 A doctor knows that Cold causes fever 50% of the time. Prior probability of any patient having cold is 1/50,000. Prior probability of any patient having fever is 1/20. If a patient has fever, what is the probability he/she has cold?
 P(C/F)= 0.0003
 P(C/F)=0.0004
 P(C/F)= 0.0002
 P(C/F)=0.0045
Correct option is C
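The arithmetic behind the answer above follows directly from Bayes' theorem, P(C|F) = P(F|C) P(C) / P(F). A small sketch using exact fractions; the function name `posterior` is illustrative.

```python
from fractions import Fraction

def posterior(likelihood, prior, evidence):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return likelihood * prior / evidence

p_fever_given_cold = Fraction(1, 2)   # cold causes fever 50% of the time
p_cold = Fraction(1, 50_000)          # prior probability of cold
p_fever = Fraction(1, 20)             # prior probability of fever

p_cold_given_fever = posterior(p_fever_given_cold, p_cold, p_fever)
print(p_cold_given_fever, float(p_cold_given_fever))
```

The result is 1/5000, that is 0.0002, which matches option C.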
 Which of the following is true about k in K-Nearest Neighbor in terms of bias?
 When you increase k, the bias increases
 When you decrease k, the bias increases
 Can't say
 None of these
Correct option is A
 When you find noise in the data, which of the following options would you consider in K-Nearest Neighbor?
 I will increase the value of k
 I will decrease the value of k
 Noise cannot be dependent on value of k
 None of these
Correct option is A
 In K-Nearest Neighbor, overfitting is very likely due to the curse of dimensionality. Which of the following options would you consider to handle this problem? 1) Dimensionality reduction 2) Feature selection
 1
 2
 1 and 2
 None of these
Correct option is C
 Learning with radial basis functions is closely related to distance-weighted regression, but it is
 lazy learning
 eager learning
 concept learning
 none of these
Correct option is B
 Radial basis function networks provide a global approximation to the target function, represented by ___ of many local kernel functions.
 a series combination
 a linear combination
 a parallel combination
 a non linear combination
Correct option is B
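The "linear combination of local kernel functions" in the answer above can be written as f(x) = w0 + sum over u of w_u * K_u(d(x_u, x)). A minimal sketch for 1-D inputs, assuming Gaussian kernels; the centers, widths, and weights are hypothetical values rather than trained parameters.

```python
from math import exp

def rbf_network(x, centers, widths, weights, bias):
    """Output of an RBF network: bias plus a weighted sum of Gaussian kernels,
    each centered on one point and falling off with distance from it."""
    return bias + sum(
        w * exp(-((x - c) ** 2) / (2 * s ** 2))
        for c, s, w in zip(centers, widths, weights)
    )

# Two kernels far apart: near x = 0 only the first one contributes noticeably.
value_at_first_center = rbf_network(0.0, centers=[0.0, 10.0],
                                    widths=[1.0, 1.0],
                                    weights=[2.0, -1.0], bias=0.5)
print(value_at_first_center)
```

Each kernel is local because its contribution vanishes away from its center, while the weighted sum over all kernels gives the global approximation.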
 The most significant phase in a genetic algorithm is
 Crossover
 Mutation
 Selection
 Fitness function
Correct option is A
 The crossover operator produces two new offspring from
 Two parent strings, by copying selected bits from each parent
 One parent strings, by copying selected bits from selected parent
 Two parent strings, by copying selected bits from one parent
 None of these
Correct option is A
 Mathematically characterize the evolution over time of the population within a GA based on the concept of
 Schema
 Crossover
 Don't care
 Fitness function
Correct option is A
 In a genetic algorithm, the process of selecting parents which mate and recombine to create offspring for the next generation is known as:
 Tournament selection
 Rank selection
 Fitness sharing
 Parent selection
Correct option is D
 Crossover operations are performed in genetic programming by replacing
 A randomly chosen subtree of one parent program by a subtree from the other parent program
 A randomly chosen root node tree of one parent program by a subtree from the other parent program
 A randomly chosen root node tree of one parent program by a root node tree from the other parent program
 None of these
Correct option is A