

1 Introduction

Resource Planning (ERP) systems were rather dry with no intersections to modern

technologies as used by Google, Twitter, Facebook, and several others.

The team decided to take a radical new approach to ERP systems. To start

from scratch, the particular enabling technologies and possibilities of upcoming

computer systems had to be identified. With this foundation, they designed a

completely new system based on two major trends in hardware technologies:

• Massively parallel systems with an increasing number of Central Processing

Units (CPUs) and CPU-cores

• Increasing main memory volumes

To leverage the parallelism of modern hardware, substantial changes had to be

made. Current systems were already parallel with respect to their ability to handle

thousands of concurrent users. However, the underlying applications were not

exploiting parallelism.

Exploiting hardware parallelism is difficult. Patterson and Hennessy [PH12] discuss

what changes have to be made to make an application run in parallel, and explain

why it is often very hard to change sequential applications to use multiple cores

efficiently.
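As a rough illustration of the kind of restructuring involved, the following minimal sketch splits an aggregation into independent partitions and combines the partial results in a final sequential step. The function names and the worker count are assumptions made for this example, not part of the original system. Note that in CPython a thread pool shows the structure of the decomposition rather than a real speed-up for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(chunk):
    # Each worker aggregates only its own partition.
    return sum(chunk)


def parallel_sum(values, workers=4):
    # Split the input into roughly equal, independent partitions.
    size = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Combine the partial results in a final sequential step.
        return sum(pool.map(partial_sum, chunks))


amounts = list(range(1, 1001))
total = parallel_sum(amounts)
```

The sequential version is a single call to sum. The parallel version needs an explicit partitioning step and a merge step, which hints at why converting existing sequential code is rarely trivial.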

For the first prototypes, the team decided to look more closely into accounting

systems. In 2006, computers were not yet capable of keeping big companies’ data

completely in memory. So the decision was made to concentrate on rather small

companies at first. It was clear that the progress in hardware development

would continue and that these advances would automatically enable the systems to

keep bigger volumes of data in memory.

Another important design decision was the complete removal of materialized

aggregates. In 2006, ERP systems depended heavily on pre-computed

aggregates. With the computing power of upcoming systems, the new design was

not only capable of increasing the granularity of aggregates, but of completely

removing them.
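The idea of dropping materialized aggregates can be sketched as follows: only the individual line items are stored, and every total is computed from them at query time. This is a minimal illustration under assumed names (LineItem, Ledger, amounts in cents), not the design of the actual system.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    account: str
    amount: int  # in cents, to avoid floating-point rounding


class Ledger:
    def __init__(self):
        self.items = []

    def post(self, item):
        # Insert only: no pre-computed total has to be updated alongside.
        self.items.append(item)

    def balance(self, account):
        # The aggregate is computed on the fly by scanning the line items.
        return sum(i.amount for i in self.items if i.account == account)

    def balances(self):
        # Any grouping can be chosen at query time.
        totals = defaultdict(int)
        for i in self.items:
            totals[i.account] += i.amount
        return dict(totals)


ledger = Ledger()
ledger.post(LineItem("cash", 10_000))
ledger.post(LineItem("cash", -2_500))
ledger.post(LineItem("revenue", 7_500))
```

Because no totals are stored, a posting touches exactly one place, and there is no risk of a stored aggregate drifting out of sync with the line items. The price is a scan per query, which is what fast in-memory processing is meant to make affordable.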

As the new system keeps every bit of the processed information in memory,

disks are only used for archiving, backup, and recovery. Data is held primarily in

Dynamic Random Access Memory (DRAM), which is made possible by

increased capacities and data compression.

To evaluate the new approach, several bachelor projects and master projects

implemented new applications using in-memory database technology over the next

several years. Ongoing research focuses on the most promising findings of these

projects as well as completely new approaches to enterprise computing with an

enhanced user experience in mind.

1.3 Learning Map

The learning map (see Fig. 1.1) gives a brief overview of the parts of the

learning material and the respective chapters in these parts. In this graph, you can

easily see what the prerequisites for a chapter are and which contents will follow.

Fig. 1.1 Learning map

1.4 Self Test Questions

1. Rely on Disks

Does an in-memory database still rely on disks?

(a) Yes, because disk is faster than main memory when doing complex

calculations

(b) No, data is kept in main memory only

(c) Yes, because some operations can only be performed on disk

(d) Yes, for archiving, backup, and recovery

References

[BMK09] P.A. Boncz, S. Manegold, M.L. Kersten, Database Architecture Evolution: Mammals

Flourished long before Dinosaurs became Extinct. PVLDB 2(2), 1648–1653 (2009)

[KNF+12] A. Kemper, T. Neumann, F. Funke, V. Leis, H. Mühe, HyPer: adapting columnar

main-memory data management for transactional and query processing. IEEE Data

Eng. Bull. 35(1), 46–51 (2012)

[PH12] D.A. Patterson, J.L. Hennessy, Computer Organization and Design: The Hardware/Software

Interface, Revised 4th edn. The Morgan Kaufmann Series in Computer

Architecture and Design (Academic Press, San Francisco, CA, USA, 2012)

[Pla09] H. Plattner, A common database approach for OLTP and OLAP using an in-memory

column database, in SIGMOD Conference, ed. by U. Çetintemel, S. Zdonik, D. Kossmann

(ACM, New York, 2009), pp. 1–2

Part I The Future of Enterprise Computing

Chapter 2 New Requirements for Enterprise Computing

When thinking about developing a completely new database management system

for enterprise computing, the question arises whether such a new database

management system is needed at all. The answer is yes! Modern companies have

changed dramatically. Nowadays, companies are more data-driven than ever

before. For example, far more data is produced during manufacturing,

e.g. by assembly line sensors or manufacturing robots. Furthermore,

companies process data at a much larger scale, e.g. on competitor behavior and price

trends, to support management decisions. And data volumes will continue to

grow in the future. There are two major requirements for a modern database

management system:

• Data from various sources has to be combined in a single database management system, and

• This data has to be analyzed in real time to support interactive decision making.

The following sections outline use cases for modern enterprises and derive

associated requirements for a completely new enterprise data management system.

2.1 Processing of Event Data

Event data increasingly influences enterprises today. Event data is characterized

by the following aspects:

• Each event dataset itself is small (some bytes or kilobytes) compared to the size

of traditional enterprise data, such as all data contained in a single sales order,

and

• The number of generated events for a specific entity is high compared to the

number of entities, e.g. hundreds or thousands of events are generated for a single

product.

In the following, use cases of event data in modern enterprises are outlined.

H. Plattner, A Course in In-Memory Data Management,

DOI: 10.1007/978-3-642-36524-9_2, © Springer-Verlag Berlin Heidelberg 2013


2.1.1 Sensor Data

Sensors are used to supervise the function of more and more systems today. One

example is the tracking and tracing of sensitive goods, such as pharmaceuticals,

clothes, or spare parts. For this purpose, packages are equipped with Radio-Frequency

Identification (RFID) tags or two-dimensional bar codes, the so-called data matrix.

Each product is virtually represented by an Electronic Product Code (EPC), which

describes the manufacturer of a product, the product category, and a unique serial

number. As a result, each product can be uniquely identified by its EPC. In

contrast, traditional one-dimensional bar codes can only be used for identification

of classes of products due to their limited domain set. Once a product passes

through a reader gate, a reading event is captured. The reading event consists of

the current reading location, timestamp, the current business step, e.g. receiving,

unpacking, repacking or shipping, and further related details. All events are stored

in decentralized event repositories.
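The structure described above can be sketched as two small record types. This is a minimal illustration: the class and field names are assumptions chosen for readability and do not follow the actual EPC encoding or any particular event repository schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass(frozen=True)
class EPC:
    # The three parts named in the text; the concrete types are assumptions.
    manufacturer: str
    product_category: str
    serial_number: int


@dataclass
class ReadEvent:
    epc: EPC
    location: str        # current reading location
    timestamp: datetime
    business_step: str   # e.g. "receiving", "unpacking", "repacking", "shipping"
    details: dict = field(default_factory=dict)  # further related details


# Two items of the same product class differ only in their serial number.
a = EPC("ACME Pharma", "pain relief", 1001)
b = EPC("ACME Pharma", "pain relief", 1002)

event = ReadEvent(a, "Warehouse Berlin", datetime(2013, 1, 15, 9, 30), "receiving")
```

The serial number is what separates an EPC from a one-dimensional bar code: without it, a and b would be indistinguishable and only the product class could be identified.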

Real-Time Tracking of Pharmaceuticals

For example, approx. 15 billion prescription-based pharmaceuticals are produced

in Europe. Tracking all of them results in approx. 8,000 read event notifications

per second. These events build the basis for anti-counterfeiting techniques. For

example, the route of a specific pharmaceutical can be reconstructed by analyzing

all relevant reading events. In-memory technology enables tracing of 10

billion events in less than 100 ms.
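Route reconstruction from reading events can be sketched as a filter followed by a sort on the timestamp. The record layout, the string form of the EPC, and the sample data below are hypothetical and serve only to illustrate the idea.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    epc: str            # product identifier (hypothetical string form)
    timestamp: int      # seconds since an arbitrary epoch
    location: str
    business_step: str


def reconstruct_route(events, epc):
    """Return the (location, business_step) stops of one product in time order."""
    relevant = [e for e in events if e.epc == epc]
    relevant.sort(key=lambda e: e.timestamp)
    return [(e.location, e.business_step) for e in relevant]


events = [
    Reading("epc-A", 300, "pharmacy", "receiving"),
    Reading("epc-B", 120, "wholesaler", "receiving"),
    Reading("epc-A", 100, "factory", "shipping"),
    Reading("epc-A", 200, "wholesaler", "repacking"),
]
route = reconstruct_route(events, "epc-A")
```

A gap or an unexpected location in such a route is the kind of signal an anti-counterfeiting check could look for. The sketch scans all events linearly and says nothing about how a real system would index or distribute them.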

Formula One Racing Cars

Formula One racing cars also generate large volumes of sensor data. These sports cars

are equipped with up to 600 individual sensors, each recording tens to hundreds of

events per second. Capturing sensor data for a two-hour race produces gigabytes or even

terabytes of sensor data, depending on its granularity. The challenge is to capture,

process, and analyze the acquired data during the race to optimize the car parameters instantly, e.g. to detect part faults, optimize fuel consumption or top speed.
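The stated range can be checked with a back-of-the-envelope calculation. The sensor count, event rates, and race duration come from the text. The event sizes are assumptions, picked from the "some bytes or kilobytes" range given earlier for event data, and 1 GB is taken as 10^9 bytes.

```python
SENSORS = 600                 # from the text
RACE_SECONDS = 2 * 60 * 60    # two-hour race


def race_events(rate_per_sensor):
    # Total number of events recorded by all sensors during the race.
    return SENSORS * rate_per_sensor * RACE_SECONDS


def race_volume_gb(rate_per_sensor, bytes_per_event):
    # Raw data volume in gigabytes (10^9 bytes), ignoring any compression.
    return race_events(rate_per_sensor) * bytes_per_event / 1e9


# (events per second per sensor, assumed bytes per event)
for rate, size in [(10, 100), (100, 1000), (100, 4000)]:
    print(f"{rate} events/s, {size} B/event: "
          f"{race_events(rate):,} events, {race_volume_gb(rate, size):.1f} GB")
```

Under these assumptions the race yields between roughly 43 million and 432 million events, and the raw volume ranges from a few gigabytes to well over a terabyte, which is consistent with the range given in the text.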

2.1.2 Analysis of Game Events

Personalized content in online games is a success factor for the gaming industry.

The German company Bigpoint is a provider of browser games with more than 200

million active users.1 Their browser games generate a steady stream of more than

1 Bigpoint GmbH, http://www.bigpoint.net/
