Replication Today, There Are A Research Paper

The queries are then translated so that they run against the local data, using local names in the local query language; in the reverse direction, results may be scaled, if needed, to take account of a change of measurement units or character codes (Applehans et al., 2004). The technical challenge for these systems is to create programs with the intelligence needed to divide queries into sub-queries that are translated and sent to local databases, and then to merge the results that come back. Great progress has been made in methods for efficient distributed query execution, and the component that does this is frequently called an intermediary (Pacitti & Simon, 2000). With reference to the points previously listed:

Space. No extra space is needed locally, apart from a temporary cache for results retrieved from remote sites.

Updates. Because a single copy of the data is used, with no local mirroring, all updates to the remote component databases are immediately available. The existing update programs can carry on running, using the local names, storage arrangements and indexes. If instead the data were transferred into some centralized format on a central computer, a huge amount of work would be required to rewrite the update programs.

Autonomy. A multi-database architecture does not affect other users of the component data resources, who can, if they wish, carry on using them exactly as before. In addition, we can take advantage of customized software tools by passing requests to them from the intermediary. One benefit of this is that the local query language can exploit the indexing systems that are available locally.

Consequently, there is no need to import large data sets from a range of servers. Nor is it necessary to convert all data for use with a single physical storage architecture. On the other hand, additional effort is required to achieve a mapping from the component databases onto the conceptual model. The suitability of a federated multi-database approach for integrating biological databases is supported by Leonard (2007) and was also proposed by Aubrey & Cohen (1996).
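The role of the intermediary described above (dividing a query into sub-queries, translating global names into local names, rescaling units on the way back, and merging the results) can be sketched roughly as follows. This is only an illustrative sketch: the class names `LocalSource` and `Intermediary`, the attribute names, and the nanometre-to-angstrom conversion are all assumptions made for the example and are not taken from the system discussed here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LocalSource:
    """One autonomous component database (hypothetical wrapper)."""
    rows: List[dict]                             # stands in for the real local store
    name_map: Dict[str, str]                     # global attribute -> local attribute
    scale: Dict[str, Callable[[float], float]]   # converts local values to global units

    def run(self, wanted: List[str]) -> List[dict]:
        """Run a sub-query with local names; return rows in global terms."""
        results = []
        for row in self.rows:
            out = {}
            for attr in wanted:
                value = row[self.name_map[attr]]
                convert = self.scale.get(attr)
                out[attr] = convert(value) if convert else value
            results.append(out)
        return results


class Intermediary:
    """Splits a global query into sub-queries and merges what comes back."""

    def __init__(self, sources: List[LocalSource], key: str):
        self.sources = sources
        self.key = key  # global attribute used to join results across sites

    def query(self, wanted: List[str]) -> List[dict]:
        merged: Dict[str, dict] = {}
        for source in self.sources:
            # Each source is asked only for the attributes it actually holds.
            sub_query = [a for a in [self.key] + wanted if a in source.name_map]
            if self.key not in sub_query:
                continue
            for row in source.run(sub_query):
                merged.setdefault(row[self.key], {}).update(row)
        return [merged[k] for k in sorted(merged)]


# Hypothetical example data: two sites using different names and units.
site_a = LocalSource(
    rows=[{"id": "P1", "len_nm": 1.5}],
    name_map={"protein_id": "id", "length_angstrom": "len_nm"},
    scale={"length_angstrom": lambda nm: nm * 10.0},
)
site_b = LocalSource(
    rows=[{"acc": "P1", "organism": "mouse"}],
    name_map={"protein_id": "acc", "species": "organism"},
    scale={},
)

answer = Intermediary([site_a, site_b], key="protein_id").query(
    ["length_angstrom", "species"]
)
print(answer)
```

Nothing is copied into a central store in this sketch: each source keeps its own names and units, and only the mapping tables held by the intermediary need to be written, which is the additional effort noted above.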

An Example Multi-Database System

In our current work, we are using the P/FDM database management system (Angoss Software, 2006), which is based on a powerful shared Functional Data Model (FDM; Shipman, 1981), to provide access to data held in different physical formats and at different sites. The FDM and its query language, Daplex (similar to OQL), arose from the MULTIBASE project (Gray et al., 2005), an early project in integrating distributed heterogeneous database systems. Another feature of the FDM is that both stored and derived data are accessed in a consistent way, through functions (hence the name Functional Data Model). This flexibility allows us to obtain data through calls to remote databases.
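The idea that stored and derived data are reached in the same way, through functions, can be illustrated with a small sketch. It is written in Python rather than Daplex, and the `EntityClass` helper, the attribute names, and the stand-in for a remote call are all hypothetical; it shows the principle only, not how P/FDM is implemented.

```python
from typing import Callable, Dict


class EntityClass:
    """Hypothetical entity class whose attributes are all functions of an object id."""

    def __init__(self, name: str):
        self.name = name
        self.functions: Dict[str, Callable[[str], object]] = {}

    def stored(self, fname: str, table: Dict[str, object]) -> None:
        # A stored function simply looks the value up in local storage.
        self.functions[fname] = lambda oid: table[oid]

    def derived(self, fname: str, fn: Callable[[str], object]) -> None:
        # A derived function computes the value, or fetches it from elsewhere.
        self.functions[fname] = fn

    def call(self, fname: str, oid: str) -> object:
        # Callers cannot tell stored and derived functions apart.
        return self.functions[fname](oid)


protein = EntityClass("protein")
protein.stored("sequence", {"p1": "MKTAY", "p2": "GG"})
protein.derived("length", lambda oid: len(protein.call("sequence", oid)))


def fetch_remote_description(oid: str):
    # Stand-in for a call to a remote database (assumption for the example).
    remote = {"p1": "example antibody chain"}
    return remote.get(oid)


protein.derived("description", fetch_remote_description)

print(protein.call("sequence", "p1"))
print(protein.call("length", "p1"))
print(protein.call("description", "p1"))
```

Because a query only ever calls functions, a function backed by a remote site can be swapped in without changing the queries that use it.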

Our main use of this database has been to support three-dimensional structural analysis and protein modeling (Sandler, 1994), and we have extended our initial general protein structure database to enable specialized techniques to be developed for modeling antibodies (Applehans et al., 2004). A strong semantic data model like the FDM offers data independence, and we have tested several alternative physical storage configurations, including hash files and relational tables (Gray & Watson, 1998).

Because this model uses object identifiers, it is also potentially useful for federated access to the newer object databases that use object storage techniques (Kemme & Alonso, 2000) and to hybrid object-relational databases. The latter have the advantage of storing many special data types such as images and sound, possibly in huge volumes, which can be cross-referenced from the usual relational tables of numerical and character data (Sandler, 1994).

FIG. 1.2: A Daplex query may be translated into a Prolog query to access data held locally, or into SRS code to access data at the EBI. However, some Daplex queries will need both local and remote data and so will be translated into a combination of Prolog and SRS code.

Our prototype federated system (Applehans et al., 2004) accesses biological databanks stored at the European Bioinformatics Institute (EBI). These databanks contain formatted flat files, and a system called the Sequence Retrieval System (SRS) maintains cross-references between related entries in different databanks, held as directories in different tables. SRS also provides a command-line interface that supports simple data selection...

Our prototype system uses a description of the EBI databanks that maps these onto entities, relationships and attributes in an FDM schema. Queries submitted by the user are analyzed and partitioned automatically into parts that refer to data held locally and parts that refer to data held at the EBI. Code generators construct data access requests to retrieve data values from the local databases, and SRS code is produced and sent to the EBI for execution. This process is illustrated in Fig. 1.1, which shows in detail the steps in processing a query that relates structural data in a local antibody database to data held at the EBI. Our P/FDM system is mainly implemented in the logic language Prolog, and query analysis and code generation are readily implemented using Prolog's powerful pattern-matching abilities. The details of this are, of course, hidden from the end user, who uses the Daplex language or a graphical interface that generates it automatically.
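The partitioning step described above can be sketched as follows. The sketch is in Python rather than Prolog, and everything in it is an assumption made for illustration: the `Condition` type, the `LOCATION` catalogue, and the entity names are hypothetical, and the generated strings are placeholders that do not reproduce real Prolog goals or real SRS syntax.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Condition:
    """One restriction in a query, e.g. antibody.name = 'ab1' (hypothetical form)."""
    entity: str
    attribute: str
    op: str
    value: object


# Hypothetical catalogue recording where each entity class is held.
LOCATION: Dict[str, str] = {
    "antibody": "local",
    "chain": "local",
    "sequence_entry": "remote",
}


def partition(conditions: List[Condition]) -> Tuple[List[Condition], List[Condition]]:
    """Split a query into the parts answered locally and the parts sent away."""
    local, remote = [], []
    for cond in conditions:
        (local if LOCATION[cond.entity] == "local" else remote).append(cond)
    return local, remote


def render(cond: Condition) -> str:
    # Placeholder text only; a real code generator would emit the target language.
    return f"{cond.entity}.{cond.attribute} {cond.op} {cond.value!r}"


def plan(conditions: List[Condition]) -> Dict[str, List[str]]:
    """Produce one request list per target from a single user query."""
    local, remote = partition(conditions)
    return {
        "local": [render(c) for c in local],
        "remote": [render(c) for c in remote],
    }


query = [
    Condition("antibody", "name", "=", "ab1"),
    Condition("sequence_entry", "organism", "=", "mouse"),
]
result = plan(query)
print(result["local"])
print(result["remote"])
```

A query that touches only local entities yields an empty remote list, so no request needs to be sent to the remote site in that case.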

Tight and Loose Coupling

Early Computing Science research concentrated on distributed database systems that were tightly coupled and accessed through an integrated data model known as a global schema. This meant that the integrated model was designed before the schemas of the individual databases, which were dependent on it. This was done partly for performance and partly to guarantee consistency of global updates of pieces of linked data. Thus this model has tended to be used within large companies such as banks, but not across autonomous sites (Gray & Watson, 1998). This implementation made the shared data model very rigid, so that when local databases had to evolve and add extra tables in different representations, there was no way for these to be held in the shared model, and consequently the extra information was not accessible to client computers. Indeed, some observers became very pessimistic about global schema integration, and rejected it as impractical and as requiring too much strong central management (Aubrey & Cohen, 1996).

These pessimistic views began to change with the enormous success of the World Wide Web, which has shown that loosely coupled systems with only local updates can be remarkably flexible and effective. In this case users have been very willing to adjust their exported data to conform to a common syntax for marked-up documents (HTML) and a common protocol for exchanging messages (HTTP). The question is whether scientists with information to exchange will go one step further and use a shared data model, because the problem with the WWW is that the information is now exchanged largely in natural language that computers can transport but not understand! Thus it is much harder for computers to process answers from autonomous web sites and be sure that the questions asked were interpreted consistently at each site.

At the top of the spectrum of choices for data integration we have tightly coupled alternatives that, as we have learnt, are too restrictive. At the bottom we have agreement solely on syntax, which corresponds roughly to the use of HTML. To obtain the desired federated information infrastructure, we believe that we do not need to adopt a common hardware platform or vendor DBMS, but we do require a shared data model across contributing sites.

References

Angoss Software. (2006) KnowledgeSEEKER in Action: Case Studies. Toronto: Angoss Software International Limited.

Applehans, W., A. Globe, and G. Laugero. (2004) Managing Knowledge: A Practical Web-Based Approach. Reading, MA: Addison-Wesley.

Aubrey, A., and P.M. Cohen. (1996) Working Wisdom: Timeless Skills and Vanguard Strategies for Learning Organizations. San Francisco: Jossey-Bass.

Gray, J., P. Helland, P. O'Neil, and D. Shasha. (2005) The dangers of replication and a solution. Proc. ACM SIGMOD Conference, pages 173-182.

Gray, P., and H. Watson. (1998) Decision Support in the Data Warehouse. Upper Saddle River, NJ: Prentice-Hall.

Kemme, B., and G. Alonso. (2000) A new approach to developing and implementing eager database replication protocols. ACM Trans. Database Systems.

Pacitti, E., and E. Simon. (2000) Update propagation strategies to improve freshness in lazy master replicated databases. VLDB Journal, 8(3-4): 305-318.

Leonard, D. (2007) Wellsprings of Knowledge: Building and Sustaining the Sources of Information. Boston: Harvard Business School Press.

Sandler, B-Z. (1994) Computer-Aided Creativity: A Guide for Engineers, Managers, and Inventors. New York: Van Nostrand Reinhold.
