Bioinformatics Machine Learning SNP Mutation Dissertation

Machine Learning Method in Bioinformatics

Bioinformatics is an integrated field that applies information technology and computer science to biology and medicine. It draws on knowledge of information systems, artificial intelligence, databases and algorithms, soft computing, software engineering, image processing, modeling and simulation, data mining, signal processing, computation and information theory, system and control theory, discrete mathematics, statistics, and circuit theory. Machine learning, on the other hand, is a subdivision of artificial intelligence concerned with techniques that allow computers to adapt their responses and initiate actions (Zhang et al., 2009). Its techniques are applied in search engines, natural language processing, bioinformatics, medical diagnosis, cheminformatics, stock market analysis, game playing, and computer vision.

The development of machine learning has been a matter of necessity, because the body of knowledge that demands sophisticated technology has grown consistently. This includes the revolution in genomics, which involves amino acid and nucleotide sequences. To store and make use of this essential information, machine learning has become imperative, and it has led to the building of several sophisticated interfaces that researchers can use to access the available databases. In general, the ability of computers to learn across this large number of applications has not only solved a great many technological problems but has also provided fertile ground for knowledge acquisition (Fasconi & Nato, 2003).

Machine Learning Approaches

Machine learning involves adopting particular approaches that serve separate objectives or functions. The two most widely applied learning scenarios each have their own criterion functions (Zhang et al., 2009). They are commonly referred to as the supervised and unsupervised approaches.

Supervised approach

This approach may also be referred to as discrimination, prediction, or classification. Algorithms are developed to assign observations to classes that are defined a priori. The algorithm is constructed on a training dataset and then tested comprehensively on an independent dataset to examine its accuracy and efficiency. For regression and classification, support vector machines are a group of related supervised learning methods. They include, among others, linear classification, which draws a straight line that forms a distinct boundary between two classes (Zhang et al., 2009). In higher dimensions these boundaries are referred to as hyperplanes, and the dot product may be replaced by a kernel function when fitting the maximum-margin hyperplane. A decision tree structure may also be applied, in which the leaves represent classifications and the branches represent the conjunctions of features that lead to those classifications. A decision tree can be converted efficiently into a set of production rules. The supervised approach also includes artificial neural networks (ANNs), groups of interconnected nodes that process information using a computational model. Information flowing through the network, whether external or internal, may change the structure of the ANN. An ANN can be used to model the relationship between inputs and outputs. The multi-layer perceptron (MLP) and the radial basis function (RBF) network are the most widely used ANN algorithms.
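
The train-then-test workflow described above can be sketched with a simple linear classifier. This is a minimal sketch rather than the method of the cited authors: it assumes a perceptron learning rule, synthetic two-dimensional data, and a 70/30 split, and the names `train_perceptron`, `predict`, and `accuracy` are hypothetical.

```python
import random

def train_perceptron(samples, labels, epochs=50, lr=0.1):
    """Fit weights w and bias b so that sign(w.x + b) matches labels in {+1, -1}."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * activation <= 0:  # misclassified: move the boundary toward x
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def predict(w, b, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def accuracy(w, b, samples, labels):
    hits = sum(predict(w, b, x) == y for x, y in zip(samples, labels))
    return hits / len(samples)

# Synthetic data (an assumption for illustration): two well-separated groups.
rng = random.Random(0)
data = [([rng.gauss(2, 0.5), rng.gauss(2, 0.5)], 1) for _ in range(50)]
data += [([rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)], -1) for _ in range(50)]
rng.shuffle(data)

# Build the classifier on the training set, then test it on held-out data.
train, test = data[:70], data[70:]
w, b = train_perceptron([x for x, _ in train], [y for _, y in train])
test_accuracy = accuracy(w, b, [x for x, _ in test], [y for _, y in test])
print(f"accuracy on independent test set: {test_accuracy:.2f}")
```

Keeping the test set separate from the training set is what makes the reported accuracy an estimate of performance on unseen data.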

Unsupervised classification

The unsupervised approach offers two distinct ways of designing selection criteria. They are distinguished by their performance metric: a classification-driven criterion and a fidelity-driven criterion. The fidelity-driven criterion depends on how much of the original information is retained or discarded after the feature dimension is reduced. The unsupervised approach operates on the basis of cluster analysis, in which a clustering method separates objects into a predetermined number of groups, assuming a pattern...
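
Separating objects into a predetermined number of groups can be illustrated with k-means, one common clustering method. The source does not name a specific algorithm, so the choice of k-means is an assumption, as are the synthetic data, the simplified farthest-point seeding, and the hypothetical function names.

```python
import random

def squared_distance(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def mean_point(points):
    return [sum(coords) / len(points) for coords in zip(*points)]

def k_means(points, k, iterations=20):
    """Partition points into k groups; returns (centroids, group index per point)."""
    # Simplified seeding (an assumption): start from the first point, then
    # repeatedly add the point farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = mean_point(members)
    return centroids, assignment

# Synthetic, unlabeled data: 30 points near (0, 0) followed by 30 near (5, 5).
rng = random.Random(1)
points = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(30)]
points += [[rng.gauss(5, 0.5), rng.gauss(5, 0.5)] for _ in range(30)]

centroids, groups = k_means(points, k=2)
print(centroids)
```

No class labels are supplied at any point; the grouping emerges only from the distances between objects.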

The first is that of the biological concept, which is linked to the nervous system as studied in neuroscience. The second describes interconnected networks of artificial neurons built on the principles of biological neurons. For classification tasks, the multi-layer perceptron is an instrumental method. A multi-layer perceptron is a feed-forward neural network with one or more layers between the input and output layers. Data therefore flows in one direction, from the input layer to the output layer. The network is trained with the back-propagation learning algorithm. Multi-layer perceptrons are applied to a wide range of pattern classification, prediction, recognition, and approximation tasks. Problems that are not linearly separable cannot be solved by a single-layer perceptron, but a multi-layer perceptron can solve them.
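
To illustrate the one-directional flow from the input layer to the output layer, the sketch below passes inputs through a network with a single hidden layer. The weights are hand-picked for illustration rather than learned, the step activation is an assumption chosen to keep the arithmetic exact, and the function names are hypothetical. With these weights the network computes XOR, a standard example of a problem that is not linearly separable.

```python
def step(z):
    """Threshold activation: fires (1) only when the weighted input is positive."""
    return 1 if z > 0 else 0

def layer(inputs, weights, biases):
    """Compute one layer's outputs from the previous layer's outputs."""
    return [
        step(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

# Hand-picked weights (illustrative, not learned):
# hidden unit 0 acts as OR, hidden unit 1 acts as AND.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [-0.5, -1.5]
# Output fires when OR is on and AND is off, which is XOR.
OUTPUT_W = [[1.0, -1.0]]
OUTPUT_B = [-0.5]

def feed_forward(x):
    hidden = layer(x, HIDDEN_W, HIDDEN_B)        # input layer -> hidden layer
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]  # hidden layer -> output layer

xor_outputs = [feed_forward([a, b]) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(xor_outputs)  # [0, 1, 1, 0]
```
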
Back-propagation is used to train the multi-layer perceptron as a feed-forward network. Training may be carried out in one of two modes: sequential mode, also called online, per-pattern, or stochastic mode, and batch mode, also called offline or per-epoch mode. Sequential mode requires little storage for each weighted connection, presents the patterns in random order, and updates the weights after every pattern, which makes the search of weight space stochastic and lowers the risk of becoming trapped in a local minimum. It can also exploit redundancy in the training set, and it is simple to implement. Batch mode offers a higher learning speed than sequential mode and is simple to parallelize. However, multi-layer perceptrons with nonlinear activation functions present complicated error surfaces that may contain many local minima (Gurney, 2003).
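
The difference between the two training modes can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure in Gurney (2003): it assumes one hidden layer of sigmoid units, a squared-error loss, the XOR patterns as training data, and summed (not averaged) gradients in batch mode. All function names are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, rng):
    # Each weight row ends with a bias weight.
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(net, x):
    hidden, output = net
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in hidden]
    y = sigmoid(sum(w * v for w, v in zip(output, h + [1.0])))
    return h, y

def gradients(net, x, target):
    """Back-propagate the error 0.5 * (y - target)**2 for a single pattern."""
    _, output = net
    h, y = forward(net, x)
    delta_out = (y - target) * y * (1 - y)
    grad_output = [delta_out * v for v in h + [1.0]]
    grad_hidden = [
        [delta_out * output[j] * hj * (1 - hj) * v for v in x + [1.0]]
        for j, hj in enumerate(h)
    ]
    return grad_hidden, grad_output

def apply_update(net, grad_hidden, grad_output, lr):
    hidden, output = net
    for row, grad_row in zip(hidden, grad_hidden):
        for i, g in enumerate(grad_row):
            row[i] -= lr * g
    for i, g in enumerate(grad_output):
        output[i] -= lr * g

def train(net, patterns, mode, epochs=200, lr=0.5, seed=0):
    """Return the number of weight updates performed."""
    rng = random.Random(seed)
    updates = 0
    for _ in range(epochs):
        if mode == "sequential":
            order = patterns[:]
            rng.shuffle(order)                   # random presentation order
            for x, t in order:
                apply_update(net, *gradients(net, x, t), lr)  # update per pattern
                updates += 1
        else:
            per_pattern = [gradients(net, x, t) for x, t in patterns]
            sum_hidden = [
                [sum(vals) for vals in zip(*rows)]
                for rows in zip(*(gh for gh, _ in per_pattern))
            ]
            sum_output = [sum(vals) for vals in zip(*(go for _, go in per_pattern))]
            apply_update(net, sum_hidden, sum_output, lr)     # one update per epoch
            updates += 1
    return updates

def mean_squared_error(net, patterns):
    return sum((forward(net, x)[1] - t) ** 2 for x, t in patterns) / len(patterns)

XOR = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]

sequential_net = init_network(2, 3, random.Random(7))
batch_net = init_network(2, 3, random.Random(7))   # identical starting weights
sequential_updates = train(sequential_net, XOR, "sequential")
batch_updates = train(batch_net, XOR, "batch")
sequential_loss = mean_squared_error(sequential_net, XOR)
batch_loss = mean_squared_error(batch_net, XOR)
print(sequential_updates, batch_updates, sequential_loss, batch_loss)
```

Sequential mode changes the weights once per pattern, so it needs no gradient accumulator, while batch mode computes all per-pattern gradients independently before a single update, which is why it parallelizes easily.
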

Random forests

The term random forest describes an ensemble learning method that bags unpruned decision trees, each of which selects features at random at every split. Decision trees are the individual learners that the ensemble combines, and they are widely used in data exploration. One example of a decision tree method is CART (classification and regression trees). The random forest combines feature selection with bagging to construct a group of decision trees with controlled variation. The random forest has several useful properties. It yields an accurate classifier, and it runs efficiently on very large databases. It can handle a large number of input variables without deleting any of them, and it provides estimates of which variables are important in the classification. It generates an internal unbiased estimate of the generalization error as the forest is built. It offers an effective way to estimate missing data and retains accuracy when a large proportion of the data is missing. For data sets with unbalanced class populations, random forests provide effective methods of balancing error. Prototypes can be computed that give information about the relationship between the variables and the classification. Proximities can be computed between pairs of cases, which is useful for clustering, locating outliers, and obtaining informative views of the data. Random forests also give an experimental method for detecting interactions between variables.
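
The combination of bagging, random feature selection at each split, and the internal error estimate can be sketched as below. This is a simplified illustration, not a reference implementation: it assumes Gini impurity as the split criterion, numeric features, integer class labels, and synthetic data in which the third feature is pure noise. All function names are hypothetical.

```python
import random
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces Gini impurity, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

def grow_tree(rows, labels, n_try, rng):
    """Grow an unpruned tree; every split considers n_try randomly chosen features."""
    if len(set(labels)) == 1:
        return labels[0]
    features = rng.sample(range(len(rows[0])), n_try)
    split = best_split(rows, labels, features)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow_tree([r for r, _ in left], [y for _, y in left], n_try, rng),
            grow_tree([r for r, _ in right], [y for _, y in right], n_try, rng))

def tree_predict(node, row):
    while isinstance(node, tuple):   # internal nodes are tuples, leaves are labels
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def grow_forest(rows, labels, n_trees, n_try, rng):
    forest = []
    for _ in range(n_trees):
        # Bagging: each tree sees a bootstrap sample drawn with replacement.
        sample = [rng.randrange(len(rows)) for _ in range(len(rows))]
        tree = grow_tree([rows[i] for i in sample], [labels[i] for i in sample], n_try, rng)
        forest.append((tree, set(sample)))
    return forest

def forest_predict(forest, row):
    votes = Counter(tree_predict(tree, row) for tree, _ in forest)
    return votes.most_common(1)[0][0]

def oob_error(forest, rows, labels):
    """Error estimate using, for each case, only trees that never trained on it."""
    wrong = counted = 0
    for i, (row, y) in enumerate(zip(rows, labels)):
        votes = Counter(tree_predict(t, row) for t, in_bag in forest if i not in in_bag)
        if votes:
            counted += 1
            wrong += int(votes.most_common(1)[0][0] != y)
    return wrong / counted if counted else 0.0

# Synthetic data: features 0 and 1 carry the class signal, feature 2 is noise.
rng = random.Random(42)
rows, labels = [], []
for label, center in ((0, 0.0), (1, 4.0)):
    for _ in range(30):
        rows.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0), rng.gauss(0.0, 1.0)])
        labels.append(label)

forest = grow_forest(rows, labels, n_trees=25, n_try=2, rng=rng)
error = oob_error(forest, rows, labels)
print(f"out-of-bag error estimate: {error:.3f}")
```

The out-of-bag estimate needs no separate test set, because each bootstrap sample leaves some cases unseen by the tree grown on it.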

On the other hand, random forests sometimes overfit on datasets with noisy classification or regression tasks. Humans also find the classifications made by random forests difficult to interpret. It is worth noting that a data set of 200 random forests has been developed to support an intuitive visualization of the model space.

Data sampling

Data sampling refers to the selection of a particular group or population for a particular study. A process of experiments, direct observation,…

References

1. Pavlov, Y.L. (2000). Random forests. Utrecht: VSP.

2. Maindonald, J.H., & Braun, J. (2010). Data analysis and graphics using R: An example-based approach. Cambridge: Cambridge University Press.

3. Gurney, K. (2003). An introduction to neural networks. Boca Raton, FL: CRC Press.

4. Zhang, Y., & Rajapakse, J.C. (2009). Machine learning in bioinformatics. New York: John Wiley & Sons.