Fundamental Research


23 March 2015  •  Case study  •  1,383 words (6 pages)


Chapter 2

Background Research

2.1 Introduction

This chapter presents the background research performed for the project. The research was carried out to gain a greater understanding of the project and to become better acquainted with the technologies used. It covered three main areas:

• Forum Summarisation Service

The Word Mining Function that is being ported to the SoC Cloud Test Bed belongs to the Forum Summarisation Service, which is introduced in section 2.2.

• Cloud Computing

As the Word Mining Function was being ported to a Cloud Environment, research was performed to gain a greater understanding of this up-and-coming computing paradigm; it is presented in section 2.3. Section 2.4 presents the SoC Cloud Test Bed setup, as this is the Cloud Environment used for experimentation.

• Hadoop MapReduce

The Word Mining Function that is being ported and tested on the SoC Cloud Test Bed has been programmed using the Hadoop MapReduce framework, so research was needed to understand how the function operates. This is given in section 2.5.

Research into Cloud Data Collection and Benchmarking Techniques is presented in section 2.6; however, little could be found that would provide assistance. Chapter 3 therefore explains the techniques used in greater detail.


2.2 Forum Summarisation Service

In recent times there has been huge growth in the Social Web, for example wikis and discussion forums. As a result, information overload is becoming a real issue, and working out which posts are relevant for users to participate in is becoming a time-consuming task. This 'Big Data', as it has come to be known, can overwhelm users and cause great stress [4].

Humans use sense making, the process of understanding information in a given situation, to create a representation of the data that helps them find the correct information. A method that automates sense making has been developed as part of the EU DICODE project; it produces "Topic Clouds" (see Figure 2.1) to help the user make sense of the information available [4].

Figure 2.1: Topic Cloud Example [http://www.carveconsulting.com/wp/wp-content/uploads/2009/12/carvecloud.jpg]

The method has been developed on a single machine, and the Service is essentially split into two functions [4]:

• Word Mining Function

This uses the processing power of Hadoop MapReduce to produce a list and a count of all the words for a given input.

• Clustering Function

This uses the data mining capabilities of Mahout to make sense of the data.

However, the size of the input data is increasing to the point where a single machine cannot cope with the demand. A Cloud-based solution is therefore being sought for the Word Mining Function of the Forum Summarisation Service, to aid user sense making.
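
To make the Word Mining Function's task concrete, the following is a minimal in-memory sketch of the map, shuffle and reduce steps that a word count follows. It is not the project's Hadoop code: the function names (map_phase, shuffle, reduce_phase, word_count), the tokenising rule and the sample posts are all assumptions chosen for illustration, and everything runs in a single Python process rather than on a cluster.

```python
import re
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    # Assumed tokenising rule: lowercase runs of letters and apostrophes.
    for word in re.findall(r"[a-z']+", document.lower()):
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Reduce step: sum the partial counts for a single word."""
    return word, sum(counts)


def word_count(documents):
    """Run map, shuffle and reduce over an iterable of text records."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


# Hypothetical forum posts used only to exercise the sketch.
posts = ["The cloud scales", "the cloud is elastic"]
counts = word_count(posts)
```

In a real MapReduce framework the map and reduce steps run on different machines and the framework performs the shuffle between them, which is what allows the same logic to be spread over many nodes once the input outgrows a single machine.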


2.3 Cloud Computing

"Clouds are a large pool of easily usable and accessible virtualised resources. These resources can be dynamically reconfigured to adjust to a variable load (scale), allowing also for an optimum resource utilisation. This pool of resources is typically exploited by a pay-per-use model in which guarantees are offered by the infrastructure provider by means of customised Service Level Agreements" [21].

Water, electricity and heating are just some of the on-demand services society provides us with today. These services are constantly available and employ a pay-per-use approach. Cloud computing is the equivalent in Computer Science [16]. Cloud Computing is the next development of computing models after Distributed Computing, Parallel Processing and Grid Computing [12]; it hints at a future where computations will not occur on local machines but in large centralised facilities run by third-party operators such as Amazon with its Elastic Compute Cloud (EC2) [9]. These centralised facilities consist of hundreds or thousands of commodity machines that all operate in parallel. For example, Google cloud computing is built with a large number of low-cost x86 server clusters [12].

Cloud computing differs from the traditional distributed computing paradigm in that [9]:

• It is scalable.

• Abstract entities can be created which encapsulate the needs of Cloud customers.

• The services provided can be dynamically configured, due to virtualization, and delivered

...
