2.6 Production and Distribution Planning






2.6.1 Production Planning

Production planning identifies the current demand for certain products and adjusts the production rate accordingly. It analyzes several indicators, such as the users' historical buying behavior, upcoming promotions, and stock levels at manufacturers and wholesalers. Production planning algorithms are complex because of the required calculations, which are comparable to those found in BI systems. With an in-memory database, these calculations can be performed directly on the latest transactional data. Thus, the algorithms are more accurate with respect to current stock levels or production issues, allowing faster reactions to unexpected incidents.
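A minimal sketch of such a demand-driven adjustment follows, assuming a hypothetical transactional schema (sales_orders, stock_levels) and a deliberately naive reorder rule; it is an illustration of computing the indicators directly on live data, not the book's planning algorithm.

def plan_production(conn, product_id, horizon_days=7, safety_stock=100):
    """Derive a production target for one product directly from live data.

    conn is any DB-API connection (e.g. sqlite3) to the transactional store.
    """
    # Demand indicator: units sold over the last 7 days, aggregated on the fly.
    (sold_last_week,) = conn.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM sales_orders "
        "WHERE product_id = ? AND order_date >= DATE('now', '-7 day')",
        (product_id,),
    ).fetchone()
    # Current stock level across manufacturers and wholesalers.
    (stock,) = conn.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM stock_levels WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    daily_demand = sold_last_week / 7
    # Naive rule: cover the forecast horizon plus a safety stock, minus stock on hand.
    return max(0, daily_demand * horizon_days + safety_stock - stock)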



2.6.2 Available-to-Promise Check

The Available-to-Promise (ATP) check validates the availability of certain goods. It analyzes whether the amounts of sold and manufactured goods are in balance. With rising numbers of products and sold goods, the complexity of the check increases. In certain situations, it can be advantageous to withdraw goods that have already been promised to certain customers and reschedule them for customers with a higher priority. ATP checks can also take additional data into account, e.g. fees for delayed or canceled deliveries, or costs for express delivery if the manufacturer is not able to send out all goods in time.

Due to the long processing time, ATP checks are traditionally executed on top of pre-aggregated totals, e.g. stock-level aggregates per day. Using in-memory databases enables ATP checks to be performed on the latest data without such pre-aggregated totals. Thus, manufacturing and rescheduling decisions can be taken on real-time data. Furthermore, removing the aggregates simplifies the overall system architecture significantly, while adding flexibility.
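As a rough illustration, the sketch below computes an ATP quantity on the fly from current transactional tables instead of a daily stock-level aggregate. The schema (stock_levels, production_plan, open_orders) and the simple netting logic are assumptions for the example, not the actual ATP algorithm.

def available_to_promise(conn, product_id):
    # Net availability = stock on hand + planned production - already promised goods,
    # computed directly on the latest transactional data.
    (on_hand,) = conn.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM stock_levels WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    (planned,) = conn.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM production_plan WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    (promised,) = conn.execute(
        "SELECT COALESCE(SUM(quantity), 0) FROM open_orders WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    return on_hand + planned - promised

def can_promise(conn, product_id, requested_quantity):
    # A new order can be confirmed if the freshly computed ATP quantity covers it.
    return available_to_promise(conn, product_id) >= requested_quantity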



2.7 Self Test Questions



1. Compression Factor
What is the average compression factor for accounting data in an in-memory column-oriented database?
(a) 100x
(b) 10x
(c) 50x
(d) 5x






2. Data Explosion
Consider the Formula 1 race car tracking example: each race car has 512 sensors, each sensor records 32 events per second, and each event is 64 bytes in size.
How much data is produced by an F1 team, if the team has two cars in the race and the race takes 2 hours?
For easier calculation, assume 1,000 bytes = 1 kB, 1,000 kB = 1 MB, and 1,000 MB = 1 GB.
(a) 14 GB
(b) 15.1 GB
(c) 32 GB
(d) 7.7 GB






Chapter 3



Enterprise Application Characteristics



3.1 Diverse Applications

An enterprise data management system should be able to handle data coming from several different source types:
• Transactional data comes from different applications, e.g., Enterprise Resource Planning (ERP) systems.
• Event processing and stream data come from machines and sensors, which are typically high-volume systems.
• Real-time analytics usually leverages structured data for transactional reporting, classical analytics, planning, and simulation.
• Finally, text analytics is typically based on unstructured data coming from the web, social networks, log files, support systems, etc.



3.2 OLTP Versus OLAP

An enterprise data management system should be able to handle transactional and analytical query types, which differ in several dimensions. Typical queries for Online Transaction Processing (OLTP) are the creation of sales orders, invoices, or accounting data, the display of a sales order for a single customer, or the display of customer master data. Online Analytical Processing (OLAP) consists of analytical queries. Typical OLAP-style queries are dunning (payment reminders), cross-selling (selling additional products or services to a customer), operational reporting, or analyzing history-based trends.
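To make the contrast concrete, the two hypothetical statements below show a highly selective OLTP lookup next to an OLAP-style aggregation over many tuples; the table and column names are illustrative assumptions, not a schema from the book.

# OLTP: highly selective lookup returning a single sales order.
oltp_query = """
    SELECT order_id, customer_id, order_date, total_amount
    FROM sales_orders
    WHERE order_id = 4711
"""

# OLAP: aggregation over a few columns but a large number of tuples,
# e.g. overdue amounts per customer for a dunning run.
olap_query = """
    SELECT customer_id, SUM(open_amount) AS overdue_total
    FROM invoices
    WHERE due_date < DATE('now') AND paid = 0
    GROUP BY customer_id
"""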

Because these query types have always been considered significantly different, it was argued that the data management system should be split into two separate systems handling OLTP and OLAP queries separately. In the literature, it is claimed that OLTP workloads are write-intensive, whereas OLAP workloads are read-only, and that the two workloads rely on "Opposing Laws of Database Physics" [Fre95].




Yet, research on current enterprise systems showed that this statement is not true [KGZP10, KKG+11]. The main difference between systems handling these query types is that OLTP systems process more queries with a single select, or highly selective queries returning only a few tuples, whereas OLAP systems calculate aggregations over only a few columns of a table, but over a large number of tuples.

For the synchronization of the analytical system with the transactional system(s), a cost-intensive ETL (Extract-Transform-Load) process is required. The ETL process takes a lot of time and is relatively complex, because all changes have to be extracted from the source system (or systems, if there are several), the data has to be transformed to fit analytical needs, and it has to be loaded into the target database.
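The following sketch outlines the three ETL steps for such a synchronization. The source and target connections, the change-tracking column, and the star-schema fact table are hypothetical simplifications used only to make the extract, transform, and load phases visible.

def run_etl(source_conn, target_conn, last_sync_time):
    # Extract: pull only the changes since the last synchronization run.
    changes = source_conn.execute(
        "SELECT order_id, customer_id, product_id, quantity, price, order_date "
        "FROM sales_orders WHERE changed_at > ?",
        (last_sync_time,),
    ).fetchall()

    # Transform: reshape operational rows to fit the analytical (star) schema.
    fact_rows = [
        (order_date, customer_id, product_id, quantity * price)
        for (order_id, customer_id, product_id, quantity, price, order_date) in changes
    ]

    # Load: insert the transformed rows into the OLAP target system.
    target_conn.executemany(
        "INSERT INTO sales_facts (date_key, customer_key, product_key, revenue) "
        "VALUES (?, ?, ?, ?)",
        fact_rows,
    )
    target_conn.commit()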



3.3 Drawbacks of the Separation of OLAP from OLTP

While the separation of the database into two systems allows for workload-specific optimizations in both systems, it also has a number of drawbacks:
• The OLAP system does not have the latest data, because the latency between the systems can range from minutes to hours, or even days. Consequently, many decisions have to rely on stale data instead of using the latest information.
• To achieve acceptable performance, OLAP systems work with predefined, materialized aggregates, which reduce the query flexibility for the user.
• Data redundancy is high. Similar information is stored in both systems, just differently optimized.
• The schemas of the OLTP and OLAP systems are different, which introduces complexity for applications using both of them and for the ETL process synchronizing data between the systems.



3.4 The OLTP Versus OLAP Access Pattern Myth

The workload analysis of multiple real customer systems reveals that OLTP and OLAP systems are not as different as expected. For OLTP systems, the lookup rate is only 10 % higher than for OLAP systems. The number of inserts is a little higher on the OLTP side. However, OLAP systems are also faced with inserts, as they have to continuously update their data. The next observation is that the number of updates in OLTP systems is not very high [KKG+11]. In high-tech companies, the update rate is about 12 %, which means that about 88 % of all tuples saved in the transactional database are never updated. In other industry sectors, research showed even lower update rates, e.g., less than 1 % in banking and discrete manufacturing [KKG+11].






This fact leads to the assumption that updating tuples in place, or alternatively deleting the old tuple, inserting the new one, and keeping track of changes in a "side note" as it is done in current systems, is no longer necessary. Instead, changed or deleted tuples can be inserted as new versions with corresponding time stamps or invalidation flags. The additional benefit of this insert-only approach is that the complete transactional data history and each tuple's life cycle are saved in the database automatically. More details about the insert-only approach are provided in Chap. 26.
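A minimal sketch of the insert-only idea, assuming a hypothetical versioned customers table where every change appends a new row with a validity time stamp and reads pick the most recent version; this is one possible realization for illustration, not the mechanism described in Chap. 26.

def change_address(conn, customer_id, new_address):
    # Insert a new version of the tuple; the old version remains untouched,
    # so the complete change history is preserved automatically.
    conn.execute(
        "INSERT INTO customers (customer_id, address, valid_from) "
        "VALUES (?, ?, CURRENT_TIMESTAMP)",
        (customer_id, new_address),
    )
    conn.commit()

def current_address(conn, customer_id):
    # The current state is simply the most recent version of the tuple.
    row = conn.execute(
        "SELECT address FROM customers WHERE customer_id = ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None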

The further fact that the workloads are not that different after all leads to the vision of reuniting the two systems and combining OLTP and OLAP data in one system.



3.5 Combining OLTP and OLAP Data

The main benefit of the combination is that both transactional and analytical queries can be executed on the same machine using the same set of data as a "single source of truth". ETL processing becomes obsolete.

Using modern hardware, pre-computed aggregates and materialized views can be eliminated, as data aggregation can be executed on demand and views can be provided virtually. With an expected response time of analytical queries below one second, it is possible to do the analytical query processing directly on the transactional data anytime and anywhere. By dropping the pre-computation of aggregates and the materialization of views, applications and data structures can be simplified, as the management of aggregates and views (building, maintaining, and storing them) is no longer necessary.
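As an illustration of dropping pre-computed aggregates, the hypothetical function below computes a sales total on demand, directly over the transactional line items, where a classical setup would have maintained a materialized daily total; the table and column names are assumptions.

def revenue_per_customer(conn, start_date, end_date):
    # Aggregate on demand over the transactional line items; no materialized
    # totals have to be built, maintained, or stored.
    return conn.execute(
        "SELECT customer_id, SUM(quantity * price) AS revenue "
        "FROM sales_line_items "
        "WHERE order_date BETWEEN ? AND ? "
        "GROUP BY customer_id",
        (start_date, end_date),
    ).fetchall()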

A mixed workload combines the characteristics of OLAP and OLTP workloads. The queries in the workload can perform full-row operations or retrieve only a small number of columns. Queries can be simple or complex, predetermined or ad hoc. This includes analytical queries that now run on the latest transactional data and are able to see real-time changes.



3.6 Enterprise Data Characteristics

By analyzing enterprise data, special data characteristics were identified. Most interestingly, many attributes of a table are not used at all, while tables can be very wide: on average, 55 % of columns are unused per company, and tables with up to hundreds of columns exist. Many columns that are used have a low cardinality of values, i.e., there are very few distinct values. Further, in many columns NULL or default values are dominant, so the entropy (information content) of these columns is very low (near zero).

These characteristics facilitate the efficient use of compression techniques, resulting in lower memory consumption and better query performance, as will be seen in later chapters.
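A small sketch of dictionary encoding, one compression technique that benefits from such low-cardinality columns (the technique is named here for illustration; the sample column is made up):

def dictionary_encode(column):
    # Map each distinct value to a small integer id; low cardinality means a
    # tiny dictionary and a compact id vector instead of repeated full values.
    dictionary = {}
    encoded = []
    for value in column:
        if value not in dictionary:
            dictionary[value] = len(dictionary)
        encoded.append(dictionary[value])
    return dictionary, encoded

# Example: a wide table's "country" column with few distinct values.
countries = ["DE", "DE", "US", "DE", "FR", "US", "DE"]
dictionary, encoded = dictionary_encode(countries)
print(dictionary)  # {'DE': 0, 'US': 1, 'FR': 2}
print(encoded)     # [0, 0, 1, 0, 2, 1, 0]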


