Integrating Heterogeneous Data Using Web Services Literature Review

Integration Approaches and Practices

The work of Ziegler and Dittrich (nd) reports that integration is "becoming more and more indispensable in order not to drown in data while starving for information." The goal of data integration is "to combine data from different sources by applying global data model and by detecting and resolving schema and data conflicts so that a homogenous, unified view can be provided." (Ziegler and Dittrich, nd) There are two reasons for data integration:

(1) Given a set of existing data sources, an integrated view is created to facilitate data access and reuse through a single data access point;

(2) Given a certain information need, data from different complementing sources is to be combined to gain a more comprehensive basis to satisfy the information need. (Ziegler and Dittrich, nd)
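
The first motivation, a single access point over existing sources, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from Ziegler and Dittrich: the two sources, their field names, the Customer global model, and the policy of summing revenue when both sources describe the same customer. The sketch resolves a schema conflict (different field names), a data conflict (cents versus dollars, differing name formats), and an entity overlap between the sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """Hypothetical global data model: the unified view offered to users."""
    customer_id: str
    name: str
    revenue_usd: float


# Hypothetical source A: already close to the global model.
SOURCE_A = [{"id": "17", "full_name": "Ada Lovelace", "revenue_usd": 1200.0}]

# Hypothetical source B: other field names, "LAST, FIRST" names, revenue in cents.
SOURCE_B = [
    {"cust_no": 17, "nm": "LOVELACE, ADA", "rev_cents": 30000},
    {"cust_no": 42, "nm": "HOPPER, GRACE", "rev_cents": 50000},
]


def from_a(row):
    return Customer(row["id"], row["full_name"], row["revenue_usd"])


def from_b(row):
    # Resolve schema and data conflicts: rename fields, reorder the name,
    # and convert cents to dollars.
    last, first = [part.strip().title() for part in row["nm"].split(",")]
    return Customer(str(row["cust_no"]), f"{first} {last}", row["rev_cents"] / 100)


def integrated_view():
    """Single access point: one homogeneous list built from both sources."""
    merged = {}
    candidates = [from_a(r) for r in SOURCE_A] + [from_b(r) for r in SOURCE_B]
    for cust in candidates:
        seen = merged.get(cust.customer_id)
        if seen is None:
            merged[cust.customer_id] = cust
        else:
            # Assumed conflict policy: same customer in both sources, sum revenue.
            merged[cust.customer_id] = Customer(
                seen.customer_id, seen.name, seen.revenue_usd + cust.revenue_usd
            )
    return sorted(merged.values(), key=lambda c: c.customer_id)
```

A caller only ever sees Customer records; which source supplied which value stays hidden behind integrated_view().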

Foundations of the SIRUP Approach

Foundations of the SIRUP approach are stated to include the following principles:

(1) Semantic Perspectives -- "a user defined conceptual model of an application domain with explicit queryable semantics for all entities and relationships appearing in it."

(2) Bipartite Integration Process -- generally two primary roles are involved: data providers and data users. It is reported that there are two distinct phases in the integration process of the SIRUP approach: (a) a data provision phase, in which administrators of local data sources explicitly declare the data offered for integration and its semantics; and (b) a Semantic Perspective modeling phase, in which users who know the application domain for which data is to be integrated define the desired Semantic Perspective.

(3) IConcepts -- An IConcept (short for Intermediate Concept) is a basic conceptual building block that acts as a linking element between data providers and data users interested in data for their information needs. Each IConcept has a queryable link to at least one concept of an ontology to explicitly define the semantics of the real-world concept it represents. Data sources are stated to provide attributes for an ontological concept represented by a particular IConcept. Through this, the data sources are able to declare which attribute data they are able and willing to provide concerning a given IConcept. For each of the attributes it is reported that additional structural metadata is provided. (Ziegler and Dittrich, nd, paraphrased) IConcepts give data providers a way to identify precisely the semantics and structure of the data that is offered for user-specific integration. For data users, an IConcept is "an access point to retrieve data from different data sources referring to the same real-world concept." (Ziegler and Dittrich, nd) IConcepts are additionally reported to conceal technical and structural heterogeneity from data users and to assist in resolving semantic conflicts according to the user's perception of the application domain.

(4) User Concepts -- a User Concept is a user-specific concept that is built through selection and combination of user-specific copies of IConcepts.

(5) Semantic Multidatasource Language -- a declarative language is provided for data provision and for the specification of User Concepts and Semantic Perspectives. This language is reported to support querying the explicit semantics and metadata assigned to User Concepts and IConcepts.

(6) Ex-ante View Definition -- in conventional approaches, users can specify views only on top of already existing schemas; this is referred to as 'ex-post view definition' because the view is "created after a schema is defined." (Ziegler and Dittrich, nd) As the name indicates, ex-ante view definition reverses this order.

(7) Pragmatic Data Integration -- approaches that integrate data against one or more global ontologies are stated to assume an ideal world in which data for all ontology concepts is available. (Ziegler and Dittrich, nd)
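
The relationship between data providers, IConcepts, and User Concepts described in principles (2) through (4) can be sketched roughly as follows. This is a simplified illustration, not the SIRUP implementation: the class and method names (IConcept.declare, IConcept.retrieve, UserConcept.query), the example ontology URI, and the two sources are all hypothetical, and a User Concept is reduced here to a selection of attributes from a single IConcept rather than a combination of several copies.

```python
from dataclasses import dataclass, field


@dataclass
class IConcept:
    """Hypothetical Intermediate Concept linking providers and users."""
    name: str
    ontology_concept: str  # assumed: URI of the linked ontology concept
    providers: dict = field(default_factory=dict)

    def declare(self, source, attributes, fetch):
        """Data provision phase: a source declares the attributes it offers.

        `fetch` is a zero-argument callable returning that source's rows.
        """
        self.providers[source] = (set(attributes), fetch)

    def retrieve(self, wanted):
        """Access point for users: collect rows from every declaring source,
        keeping only the wanted attributes that the source actually offers."""
        rows = []
        for source in sorted(self.providers):
            offered, fetch = self.providers[source]
            for row in fetch():
                selected = {a: row[a] for a in wanted if a in offered}
                rows.append({"_source": source, **selected})
        return rows


@dataclass
class UserConcept:
    """Hypothetical user-specific selection on top of one IConcept."""
    name: str
    iconcept: IConcept
    attributes: list

    def query(self):
        return self.iconcept.retrieve(self.attributes)


# Provision phase: two sources declare what they offer for the Book concept.
book = IConcept("Book", "http://example.org/onto#Book")
book.declare("library", ["title", "isbn"], lambda: [{"title": "Dune", "isbn": "123"}])
book.declare("shop", ["title", "price"], lambda: [{"title": "Dune", "price": 9.5}])

# Modeling phase: a user selects the attributes relevant to their domain.
priced_book = UserConcept("PricedBook", book, ["title", "price"])
```

In this sketch the user never addresses "library" or "shop" directly; the IConcept hides which source can supply which attribute.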

Model Management

The work of Bernstein, Halevy, and Pottinger (nd) entitled "A Vision for Management of Complex Models" reports on the challenges met in constructing applications of database management systems (DBMSs) and on how this work includes "the manipulation of models." A model is described as "a complex discrete structure that represents a design artifact, such as an XML DTD, web-site schema, interface definition, relational schema, database transformation script, workflow definition, semantic network, software configuration or complex document." (Bernstein, Halevy, and Pottinger, nd)

The use of models includes managing the changes that take place in models and transforming data from one model to another, which is reported to require "an explicit representation of mappings between models." (Bernstein, Halevy, and Pottinger, nd) Bernstein, Halevy, and Pottinger believe that the DBMS could be made easier to use through "making 'model' and 'mapping' first-class objects with high-level operations that simplify their use…" which is referred to as "model management." (Bernstein, Halevy, and...

(nd)
According to Bernstein, Halevy, and Pottinger (nd), the data model consists of "formal structures for representing models and mappings between models and of algebraic operations on those structures." Present-day model management applications, although functionally advanced by relational and OO DBMSs, "still include a lot of complex code for navigating graph-like structures. Producing, understanding, tuning, and maintaining navigational code is a serious drag on programmer productivity, making model management applications expensive to build." (Bernstein, Halevy, and Pottinger, nd)

Bernstein, Halevy, and Pottinger propose to raise the level of abstraction beyond current DBMSs through the introduction of "high-level operations on models and model mappings." (Bernstein, Halevy, and Pottinger, nd) Examples are "matching, merging, selection and composition," none of which is a particularly novel operation. (Bernstein, Halevy, and Pottinger, nd) The following examples of models and mappings are stated to illustrate the "pervasiveness and scope of model management." (Bernstein, Halevy, and Pottinger, nd) They are as follows:

(1) mapping an XML schema of one application to that of another in order to guide the exchange of XML instances between the applications;

(2) mapping a web site's content to its page layout in order to drive the generation of web pages;

(3) mapping data sources into data warehouse tables in order to generate programs that transform production data and load it into a data warehouse;

(4) mapping the DB schema of one software release into that of the next release, to guide the migration of DBs;

(5) mapping source make files into target make files in order to drive the transformation of make scripts and thereby help port complex applications from one programming environment to another; and

(6) mapping the components of a complex application to the components of a system where it will be deployed in order to drive the generation of installation, upgrade, and de-installation programs. (Bernstein, Halevy, and Pottinger, nd)
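
Mappings such as the schema-to-schema mapping in the first example have to come from somewhere, and matching is the operation that proposes them. The sketch below is a deliberately naive illustration and not the matching algorithm of Bernstein, Halevy, and Pottinger: it assumes that two elements correspond when their names are equal after lowercasing and stripping punctuation, and the function names normalize and match are hypothetical.

```python
import re


def normalize(name: str) -> str:
    """Lowercase a name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def match(source_elements, target_elements):
    """Naive match: propose (source, target) pairs whose normalized names agree.

    Returns a set of pairs; an element with no counterpart is simply absent.
    """
    by_norm = {}
    for target in target_elements:
        by_norm.setdefault(normalize(target), []).append(target)
    return {
        (source, target)
        for source in source_elements
        for target in by_norm.get(normalize(source), [])
    }


# Hypothetical element names from two schemas.
proposed = match(
    ["CustomerName", "zip_code", "phone"],
    ["customer_name", "ZipCode", "fax"],
)
```

Here "phone" and "fax" remain unmatched. A practical matcher would also use structure, types, and instance data, and would normally leave the final decision to a human.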

Constructing generic functions for creating models and mappings enables them to be manipulated as single objects, which creates a better environment for the tasks just listed. The glue between the systems is reported to be provided by "simple adapters that:

(1) import or export a model in the model management system from or to a schema in the target platform; or

(2) interpret a mapping in the model management system to transform instances of one target model to those of another." (Bernstein, Halevy, and Pottinger, nd)

It is stated that there are many challenges in identifying sound architectures for system coupling.
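
The second kind of adapter, one that interprets a mapping to transform instances, can be sketched as follows. The representation is an assumption made for illustration only: a mapping is stored as plain data (target field, source field, converter), and the names ORDER_V1_TO_V2 and interpret, like the order fields themselves, are hypothetical.

```python
# A mapping held as data rather than code:
# target field -> (source field, converter applied to the source value).
ORDER_V1_TO_V2 = {
    "order_id": ("id", str),
    "total": ("amount_cents", lambda cents: cents / 100),
    "customer": ("cust", str.strip),
}


def interpret(mapping, instance):
    """Adapter: apply a declarative mapping to one source instance.

    Source fields the mapping does not mention are dropped.
    """
    return {
        target: convert(instance[source])
        for target, (source, convert) in mapping.items()
    }


v1_order = {"id": 7, "amount_cents": 2050, "cust": " acme ", "legacy_flag": True}
v2_order = interpret(ORDER_V1_TO_V2, v1_order)
```

Because the mapping is data, the same interpret function serves any pair of models. This is the sense in which the adapter stays simple while the mapping carries the knowledge.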

The leverage of building model management functionality is stated to be "highly generic…[and]…widely applicable." (Bernstein, Halevy, and Pottinger, nd) Model management applications are described as "metadata management" and it is stated that the primary effort in building such an application is "in manipulating descriptions of a thing of interest, rather than the thing itself." (Bernstein, Halevy, and Pottinger, nd) The question is posed as to whether keywords are actually data or if they are metadata and it is stated that model management "takes a different cut at the problem. It focuses attention on a particular kind of metadata, structure and mathematical semantics of descriptive information." (Bernstein, Halevy, and Pottinger, nd)

A primary goal of model management is stated to be support for managing change in models and for mapping data between models. Model mappings must therefore be manipulated as first-class citizens. Key elements underlying the approach of Bernstein, Halevy, and Pottinger (nd) to model mappings include:

(1) the need to manipulate model mappings much as models are manipulated;

(2) a mapping consists of connections between instances of two models, which are often of different types;

(3) there may be more than one mapping between a given pair of models;

(4) a mapping may relate a set of objects in one model to a set of objects in another via a language for building complex expressions;

(5) mappings must be able to nest because this enables the reuse of mappings: a mapping on a model M can be used as a component of a mapping on models that contain M. (Bernstein, Halevy, and Pottinger, nd)
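
Treating a mapping as a first-class object, as the elements above require, might look like the following sketch. It is an illustration under stated assumptions rather than the authors' formalism: a mapping is reduced to a set of element pairs, and the class Mapping with its methods compose and nested_in, as well as the address and customer models, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Hypothetical first-class mapping: pairs of (source, target) elements."""
    source_model: str
    target_model: str
    pairs: frozenset

    def compose(self, other: "Mapping") -> "Mapping":
        """Chain two mappings: A -> B followed by B -> C yields A -> C."""
        if self.target_model != other.source_model:
            raise ValueError("target of the first must be source of the second")
        pairs = frozenset(
            (s, t2)
            for (s, t1) in self.pairs
            for (s2, t2) in other.pairs
            if t1 == s2
        )
        return Mapping(self.source_model, other.target_model, pairs)

    def nested_in(self, source_model, target_model,
                  source_prefix, target_prefix, extra=()):
        """Reuse this mapping as a component of a mapping on larger models
        that contain the mapped model under the given prefixes."""
        inner = {
            (f"{source_prefix}.{s}", f"{target_prefix}.{t}")
            for s, t in self.pairs
        }
        return Mapping(source_model, target_model, frozenset(inner | set(extra)))


addr_v1_v2 = Mapping("AddressV1", "AddressV2",
                     frozenset({("zip", "postal_code"), ("town", "city")}))
addr_v2_v3 = Mapping("AddressV2", "AddressV3",
                     frozenset({("postal_code", "postcode"), ("city", "locality")}))

# Composition: derive the V1 -> V3 mapping without writing it by hand.
addr_v1_v3 = addr_v1_v2.compose(addr_v2_v3)

# Nesting: the address mapping becomes part of a customer mapping.
customer_v1_v2 = addr_v1_v2.nested_in(
    "CustomerV1", "CustomerV2", "address", "addr",
    extra={("name", "full_name")},
)
```

Nothing prevents several distinct Mapping objects from existing between the same pair of models, which corresponds to element (3).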

Databases to Dataspaces (Franklin, Halevy and Maier, 2005)

Franklin, Halevy and Maier (2005) write in the work entitled "From Databases to Dataspaces: A New Abstraction for Information Management" that a Database Management System (DBMS) is a generic repository…

References

Bernstein, Philip A., Halevy, Alon Y., and Pottinger, Rachel A. (nd) A Vision for Management of Complex Models.

Franklin, Michael, Halevy, Alon, Maier, David (2005) From Databases to Dataspaces: A New Abstraction for Information Management. ACM SIGMOD December 2005.

Building Data Integration Systems: A Mass Collaboration Approach.

Chawathe, S. et al. (nd) The TSIMMIS Project: Integration of Heterogeneous Information Sources.

Fuxman, Ariel and Miller, Renee J. (2003) Towards Inconsistency Management in Data Integration Systems. Retrieved from: http://www.isi.edu/info-agents/workshops/ijcai03/papers/fuxman-ijcai3.pdf

Schmidt, John and Lyle, David (2010) Why Lean Integration is Important to Data Integration Systems. Search Data Management. Tech Target. Retrieved from: http://searchdatamanagement.techtarget.com/tutorial/Why-lean-integration-is-important-to-data-integration-systems