Friday, March 14, 2014


Active Session History


The V$ACTIVE_SESSION_HISTORY view provides sampled session activity in the instance. Active sessions are sampled every second and are stored in a circular buffer in the SGA. Any session that is connected to the database and is waiting for an event that does not belong to the Idle wait class is considered an active session. This includes any session that was on the CPU at the time of sampling.
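
As a quick illustration, a query like the following summarizes the recent in-memory samples by wait event (a minimal sketch; it only assumes the documented EVENT and SAMPLE_TIME columns of V$ACTIVE_SESSION_HISTORY):

-- Recent ASH samples grouped by wait event; samples taken while the session
-- was on the CPU have a NULL event and are reported here as 'ON CPU'.
SELECT NVL(event, 'ON CPU') AS event,
       COUNT(*)             AS samples
FROM   v$active_session_history
WHERE  sample_time > SYSTIMESTAMP - INTERVAL '5' MINUTE
GROUP  BY NVL(event, 'ON CPU')
ORDER  BY samples DESC;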


Topics
  1. Active Session History Overview
  2. Active Session History Mechanics
  3. Generating ASH Reports
  4. ASH Report Structure
  5. ASH Report Activity Over Time

Generating ASH Reports


Through SQL

SQL> -- Parameters read by ashrpti.sql; an empty string accepts the script's default.
SQL> define dbid        = '';
SQL> define inst_num    = '';
SQL> define report_type = 'text';
SQL> define begin_time  = '09:00';
SQL> define duration    = 480;
SQL> define report_name = '/tmp/sql_ashrpt.txt';
SQL> define slot_width  = '';
SQL> -- Optional filters: restrict the report to one session, SQL_ID, wait class, and so on.
SQL> define target_session_id   = '';
SQL> define target_sql_id       = 'abcdefghij123';
SQL> define target_wait_class   = '';
SQL> define target_service_hash = '';
SQL> define target_module_name  = '';
SQL> define target_action_name  = '';
SQL> define target_client_id    = '';

SQL> @?/rdbms/admin/ashrpti
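
If you prefer not to run the interactive script, the same report can be produced with the DBMS_WORKLOAD_REPOSITORY.ASH_REPORT_TEXT pipelined function. The sketch below uses placeholder values for the DBID and instance number, so adjust them for your database:

-- Generate an ASH report for the last hour directly in SQL*Plus.
SET PAGESIZE 0 LINESIZE 200
SELECT output
FROM   TABLE(DBMS_WORKLOAD_REPOSITORY.ash_report_text(
               1234567890,       -- DBID, see V$DATABASE.DBID (placeholder)
               1,                -- instance number (placeholder)
               SYSDATE - 1/24,   -- begin time: one hour ago
               SYSDATE));        -- end time: now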

ASH Report Activity Over Time



One of the most informative sections of the ASH report is the Activity Over Time section. In this section, the ASH report time span is divided into 10 time slots. If the time period is too short or the data too sparse, the resulting report has fewer slots. Slots 2 through 9 are all the same size, defined as a whole number of minutes, which makes them easy to compare; it is therefore best to compare the inner slots to one another.


Using the Activity Over Time section, you can perform skew analysis by looking for spikes in the Event Count column. Such a spike indicates an increase in the number of processes waiting for a particular event. A spike in the Slot Count indicates an increase in active sessions, because ASH data is sampled from active sessions only. In the example, the number of active session samples increases in one slot, and the number of sessions waiting on the buffer busy waits event spikes in the same slot. This kind of skew in a slot may well represent the root cause of the problem being investigated.
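
The same kind of skew analysis can be approximated directly against the in-memory samples. The sketch below buckets the last hour of ASH data into one-minute slots (the one-minute bucket width is an arbitrary choice for illustration):

-- Count samples per minute and per event to spot slots where one event spikes.
SELECT TO_CHAR(TRUNC(CAST(sample_time AS DATE), 'MI'), 'HH24:MI') AS slot,
       NVL(event, 'ON CPU')                                       AS event,
       COUNT(*)                                                   AS event_count
FROM   v$active_session_history
WHERE  sample_time > SYSTIMESTAMP - INTERVAL '60' MINUTE
GROUP  BY TRUNC(CAST(sample_time AS DATE), 'MI'), NVL(event, 'ON CPU')
ORDER  BY slot, event_count DESC;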

ASH Report Structure



The ASH report follows the pattern of the AWR report. Its sections are as follows:

  • Top Events: Reports the user events, background events, and the event parameter values

  • Load Profile: Reports the top service/module combinations and top clients, and identifies the types of SQL commands and the top phases of execution

  • Top SQL: Reports top SQL statements associated with the top events, SQL associated with the top rowsources, top SQL using literals, and the SQL text for these SQL statements.

  • Top PL/SQL Procedures: Lists the PL/SQL procedures that accounted for the highest percentages of sampled session activity

  • Top Java Workload: Describes the top Java programs in the sampled session activity 

  • Top Sessions: Reports the top sessions found waiting, top blocking sessions, and aggregates for PQ sessions

  • Top Objects/Files/Latches: Reports the top objects, files, or latches that were involved in a wait 

  • Activity Over Time: Reports the top three wait events for 10 equally sized time periods during the report period

Active Session History Mechanics

It would be very expensive to record all activities of all sessions, so ASH extracts only samples of information from V$SESSION: one sample per second. This is done efficiently, without using SQL.


ASH is designed as a rolling buffer in memory, and previous information is overwritten when needed. The volume of the data in the ASH buffer can be very large, and flushing it all to disk is unacceptable. A more efficient approach is to filter the history data while flushing it to the workload repository. 

This is done automatically by the manageability monitor (MMON) process every 60 minutes, and by the manageability monitor light (MMNL) process whenever the buffer is full. 


Note: The memory for the ASH comes from the System Global Area (SGA), and it is fixed for the lifetime of the instance. It represents 2 MB of memory per CPU. The ASH cannot exceed a maximum bound of five percent of the shared pool size, or five percent of the SGA_TARGET.


ASH statistics are available through the V$ACTIVE_SESSION_HISTORY fixed view. This view contains one row for each active session per sample. All columns that describe the session in the ASH are present in the V$SESSION view.
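
To see these mechanics in practice, you can compare the in-memory samples with the filtered samples that are flushed to the workload repository. A minimal sketch follows (the session ID 123 is a placeholder):

-- In-memory samples for one session (seconds-level granularity, recent history).
SELECT sample_time, session_state, event, sql_id
FROM   v$active_session_history
WHERE  session_id = 123
ORDER  BY sample_time;

-- Filtered samples persisted to the AWR for the same session.
SELECT sample_time, session_state, event, sql_id
FROM   dba_hist_active_sess_history
WHERE  session_id = 123
ORDER  BY sample_time;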

Automatic Database Diagnostic Monitor


When problems occur with a system, it is important to perform accurate and timely diagnosis of the problem before making any changes to a system. Oftentimes, a database administrator (DBA) simply looks at the symptoms and immediately starts changing the system to fix those symptoms. However, an accurate diagnosis of the actual problem in the initial stage significantly increases the probability of success in resolving the problem.

With Oracle Database, the statistical data needed for accurate diagnosis of a problem is stored in the Automatic Workload Repository (AWR). The Automatic Database Diagnostic Monitor (ADDM):

  • Analyzes the AWR data on a regular basis
  • Diagnoses the root causes of performance problems
  • Provides recommendations for correcting any problems
  • Identifies non-problem areas of the system
Because AWR is a repository of historical performance data, ADDM can analyze performance issues after the event, often saving time and resources in reproducing a problem.

In most cases, ADDM output should be the first place that a DBA looks when notified of a performance problem. ADDM provides the following benefits:

  • Automatic performance diagnostic report every hour by default
  • Problem diagnosis based on decades of tuning expertise
  • Time-based quantification of problem impacts and recommendation benefits
  • Identification of root cause, not symptoms
  • Recommendations for treating the root causes of problems
  • Identification of non-problem areas of the system
  • Minimal overhead to the system during the diagnostic process
It is important to realize that tuning is an iterative process, and fixing one problem can cause the bottleneck to shift to another part of the system. Even with the benefit of ADDM analysis, it can take multiple tuning cycles to reach acceptable system performance. ADDM benefits apply beyond production systems; on development and test systems, ADDM can provide an early warning of performance issues.
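
Outside Enterprise Manager, the standard report script is a quick way to view the latest ADDM analysis from SQL*Plus; the script prompts for the snapshot range and report name:

SQL> @?/rdbms/admin/addmrpt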


Topics
  1. ADDM Performance Monitoring
  2. ADDM and Database Time
  3. DBTime-Graph and ADDM Methodology
  4. ADDM with Oracle Real Application Clusters

Thursday, March 13, 2014

ADDM with Oracle Real Application Clusters

If you are using Oracle RAC, you can run ADDM in Database analysis mode to analyze the throughput performance of all instances of the database. In Database mode, ADDM considers DB time as the sum of the database time for all database instances. Using the Database analysis mode enables you to view all findings that are significant to the entire database in a single report, instead of reviewing a separate report for each instance.

The Database mode report includes findings about database resources (such as I/O and interconnect). The report also aggregates findings from the various instances if they are significant to the entire database. For example, if the CPU load on a single instance is high enough to affect the entire database, the finding will appear in the Database mode analysis, which will point to the particular instance responsible for the problem.
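
A hedged sketch of invoking Database analysis mode through the documented DBMS_ADDM package follows; the snapshot IDs 100 and 101 and the task name are placeholders:

-- Create and run an ADDM task that analyzes all instances between two snapshots.
VARIABLE tname VARCHAR2(60)
BEGIN
  :tname := 'ADDM_DB_MODE_EXAMPLE';
  DBMS_ADDM.ANALYZE_DB(:tname, 100, 101);
END;
/
-- Print the resulting report.
SET LONG 1000000 PAGESIZE 0
SELECT DBMS_ADDM.GET_REPORT(:tname) FROM dual;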


Automatic Database Diagnostic Monitor and Oracle RAC Performance

The Automatic Database Diagnostic Monitor (ADDM) is a self-diagnostic engine built into the Oracle Database. ADDM examines and analyzes data captured in the Automatic Workload Repository (AWR) to determine possible performance problems in Oracle Database. ADDM then locates the root causes of the performance problems, provides recommendations for correcting them, and quantifies the expected benefits. ADDM analyzes the AWR data for performance problems at both the database and the instance level.

An ADDM analysis is performed as each AWR snapshot is generated, which is every hour by default. The results are saved in the database and can be viewed by using Enterprise Manager. Any time you have a performance problem, you should first review the results of the ADDM analysis. An ADDM analysis is performed from the top down, first identifying symptoms, then refining the analysis to reach the root causes, and finally providing remedies for the problems.

For the clusterwide analysis, Enterprise Manager reports two types of findings:


Database findings: An issue that concerns a resource that is shared by all instances in the cluster database, or an issue that affects multiple instances. An example of a database finding is I/O contention on the disk system used for shared storage.


Instance findings: An issue that concerns the hardware or software that is available for only one instance, or an issue that typically affects just a single instance. Examples of instance findings are high CPU load or sub-optimal memory allocation.


ADDM reports only the findings that are significant, that is, findings that take up a significant amount of instance or database time. Instance time is the amount of time spent using a resource due to a performance issue for a single instance, and database time is the sum of the time spent using a resource due to a performance issue across all instances of the database, excluding any Oracle Automatic Storage Management (Oracle ASM) instances.

An instance finding can be reported as a database finding if it relates to a significant amount of database time. For example, if one instance spends 900 minutes using the CPU, and the sum of all time spent using the CPU for the cluster database is 1040 minutes, then this finding would be reported as a database finding because it takes up a significant amount of database time.

A problem finding can be associated with a list of recommendations for reducing the impact of the performance problem. Each recommendation has a benefit that is an estimate of the portion of database time that can be saved if the recommendation is implemented. A list of recommendations can contain various alternatives for solving the same problem; you do not have to apply the recommendations.

Recommendations are composed of actions and rationales. You must apply all the actions of a recommendation to gain the estimated benefit of that recommendation. The rationales explain why the actions were recommended, and provide additional information to implement the suggested recommendation.


Wednesday, March 12, 2014

Automated Maintenance Tasks


With Oracle Database 11g, the Automated Maintenance Tasks feature relies on the Resource Manager being enabled during the maintenance windows.

The resource plan associated with the window is automatically enabled when the window opens. 

The goal is to prevent maintenance work from consuming excessive amounts of system resources. Each maintenance window is associated with a resource plan that specifies how resources are allocated for the duration of the window.

In Oracle Database 11g, WEEKNIGHT_WINDOW and WEEKEND_WINDOW (defined in Oracle Database 10g) are replaced with daily maintenance windows. 

Automated Maintenance Task Priorities 

  • The Automated Maintenance Tasks feature is implemented by Autotask Background Process (ABP). ABP functions as an intermediary between automated tasks and the Scheduler. 
  • Its main purpose is to translate automated tasks into AUTOTASK jobs for execution by the Scheduler. 
  • Just as important, ABP maintains a history of execution of all tasks. ABP stores its private repository in the SYSAUX tablespace; you view the repository through DBA_AUTOTASK_TASK. 
  • ABP is started by MMON at the start of a maintenance window. There is only one ABP required for all instances. The MMON process monitors ABP and restarts it if necessary. 
  • ABP assigns jobs to various Scheduler job classes. These job classes map the job to a consumer group based on priority.
  • The Automatic Maintenance Tasks feature determines when—and in what order—tasks are performed. As a DBA, you can control the following:
    • If the maintenance window turns out to be inadequate for the maintenance workload, adjust the duration and start time of the maintenance window.
    • Control the resource plan that allocates resources to the automated maintenance tasks during each window.
    • Enable or disable individual tasks in some or all maintenance windows.
    • In a RAC environment, shift maintenance work to one or more instances by mapping maintenance work to a service. Enabling the service on a subset of instances shifts maintenance work to these instances.
  • Enterprise Manager is the preferred way to control Automatic Maintenance Tasks. However, you can also use the DBMS_AUTO_TASK_ADMIN package.
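
As a sketch of the DBMS_AUTO_TASK_ADMIN approach, the following disables the automatic SQL tuning task in all maintenance windows and then checks the client status (the client names come from DBA_AUTOTASK_CLIENT):

-- Disable one automated maintenance task in every window.
BEGIN
  DBMS_AUTO_TASK_ADMIN.DISABLE(
    client_name => 'sql tuning advisor',
    operation   => NULL,
    window_name => NULL);
END;
/

-- Verify which automated maintenance clients are enabled.
SELECT client_name, status
FROM   dba_autotask_client;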

AWR Baseline


An AWR baseline contains a set of AWR snapshots for an “interesting” or reference period of time.
  • Monitor the current state of the database instance and compare it to a previous state.
  • Examine AWR or Statspack reports and instance files carefully.
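
For reference, a minimal sketch of creating a static baseline with the documented DBMS_WORKLOAD_REPOSITORY package; the snapshot IDs and the baseline name are placeholders:

-- Create a baseline over an existing snapshot range, then list defined baselines.
BEGIN
  DBMS_WORKLOAD_REPOSITORY.CREATE_BASELINE(
    start_snap_id => 120,
    end_snap_id   => 125,
    baseline_name => 'peak_tuesday_morning');
END;
/

SELECT baseline_name, start_snap_id, end_snap_id
FROM   dba_hist_baseline;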

Objectives

  • Create AWR baselines
  • Enable adaptive thresholds
  • Create AWR baselines for future time periods

Topics

  1. Automatic Workload Repository Baselines
  2. Moving Window Baseline
  3. Creating AWR Baselines
  4. Performance Monitoring and Baselines
  5. Defining Alert Thresholds Using a Static Baseline
  6. Automatic Workload Repository Baselines
  7. Baselines in Performance Page Settings
  8. Managing Baselines with PL/SQL
  9. Using EM to Configure and Change Adaptive Threshold Settings
  10. Comparative Performance Analysis with AWR Baselines


Refer to the links below:

Resolving Performance Degradation Over Time
Using Automatic Workload Repository for Database Tuning: Tips for Expert DBAs

Defining a Problem


Problems can arise at any time. A proactive DBA watches for problems and corrects them before they are noticed by users. In the past, the discovery and definition step has been tedious and frequently dependent on listening to user feedback. User feedback is important, but is often subjective and not reproducible. In Oracle Database 11g, many of the information sources can be viewed from the Enterprise Manager interface.
  • Monitor the current state of the database instance and compare it to a previous state.
  • Examine AWR or Statspack reports and instance files carefully.

Objectives

  • Identify performance issues
  • Set tuning priorities
  • Interpret tuning diagnostics
  • Tune for life cycle phase


Topics


  1. Limit the Scope
  2. Defining the Problem
  3. Setting the Priority
  4. Top SQL Reports
  5. Common Tuning Problems
  6. Tuning During the Life Cycle
  7. ADDM Tuning Session
  8. Performance Versus Business Requirements


Refer to the links below:

Performance Tuning Overview
Performance Tuning Exam(1Z0-054)

Monday, February 24, 2014

Performance Versus Business Requirements

Factors that affect performance:
  • Frequent checkpointing
  • Archiving of redo logs
  • Block checksums
  • Redundancy
    • Frequent backups of data files
    • Multiple control files
    • Multiple redo log members in a group
  • Security
    • Auditing
    • Encryption
    • Virtual Private Database/Fine Grained Access Control

There is always a cost of doing business in certain ways, and the requirements of the business often affect performance. The impact of downtime, crash recovery, and even the unlikely event of block corruption must be weighed against the overhead of protecting against these events. Redundancy improves availability but requires more I/O.

The question is not whether to use the security features, but which of them are required; use only those features.

Often the business can only tolerate minimal downtime, so you configure the database for the business, and then tune your database accordingly. The uptime requirement, the mean time to recovery, and the amount of data that could be lost in a disk or system crash are all business issues.

ADDM Tuning Session


ADDM consolidates the manual tuning process to quickly identify the areas that produce the greatest benefit. ADDM runs automatically each time an AWR snapshot is taken, and the findings of the latest ADDM run are displayed on the Database Home page of Enterprise Manager.

Tuning During the Life Cycle



An application’s life cycle can be divided into different phases:
  • Application design and development
  • Testing: Database configuration
  • Deployment: Adding a new application to an existing database
  • Production: Troubleshooting and tuning
  • Migration, upgrade, and environment changes


Tuning can be divided into two classes:
  • Proactive (make it better, so it will not break) 
    • Test scenarios. 
    • Find the problem areas. 
    • Resolve the problem. 
  • Reactive (wait until it breaks, then fix it) 
    • Monitor active instance. 
    • Tune issues as needed.


Tuning during the life cycle involves two courses of action: proactive or reactive tuning. During the design, development, and testing phases, tuning is mostly proactive; that is, scenarios and test cases are designed and tested, and the results are measured and compared against other configurations. In the deployment and production environments, the tuning is mostly reactive. The need for hypothetical loads is removed as actual users and workloads arrive, but the ability to anticipate problems also diminishes. You can monitor the database instance to observe changes and trends in performance metrics. From the information you gather by monitoring, you may be able to mitigate performance issues before they are noticed by users.

The DBA may be involved in tuning from the earliest stages of design and development. It is less expensive to correct bugs and performance problems early in the life cycle than later. The differences in tuning the later phases of the life cycle are primarily in what is allowed. Many DBAs in a production environment are not allowed to change the SQL or data structures. However, a design change to improve performance may warrant a change request to the application vendor or development team.

Common Tuning Problems



The most common tuning problems:
  1. Inefficient or high-load SQL statements
  2. Suboptimal use of Oracle Database by the application
  3. Undersized memory structures
  4. Concurrency issues
  5. I/O issues 
  6. Database configuration issues
  7. Short-lived performance problems
  8. Degradation of database performance over time 
  9. Unexpected performance regression after environment changes 
  10. Locking issues

Tuning inefficient or high-load SQL statements has a wide impact: it can reduce memory, CPU, and I/O consumption. SQL tuning issues include poorly written SQL, ineffective use of indexes, access path costs, and sorting. In this course, we assume that the DBA has little or no opportunity to change the SQL statements.

Problems such as repeatedly establishing new database connections, excessive SQL parsing, and high levels of contention for a small amount of data (also known as application-level block contention) can degrade application performance significantly. These all represent poor use of the database by the application.

Memory issues are high on the list of instance tuning problems. Proper sizing of the System Global Area (SGA), including the shared pool and buffer cache, and of the Program Global Area (PGA) reduces contention for memory resources and indirectly reduces I/O and CPU usage.
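
The memory advisor views are one way to check this sizing; a brief sketch against the documented V$SGA_TARGET_ADVICE and V$PGA_TARGET_ADVICE views:

-- Estimated DB time at different SGA sizes.
SELECT sga_size, sga_size_factor, estd_db_time
FROM   v$sga_target_advice
ORDER  BY sga_size;

-- Estimated PGA cache hit percentage and over-allocations at different PGA targets.
SELECT ROUND(pga_target_for_estimate/1024/1024) AS pga_target_mb,
       estd_pga_cache_hit_percentage,
       estd_overalloc_count
FROM   v$pga_target_advice
ORDER  BY pga_target_for_estimate;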

A high degree of concurrent activity from multiple processes or users can result in contention for shared resources, which shows up as various types of waits. Many resources can be accessed by only one process at a time, so several processes attempting to access the same resource create contention.


In any database, I/O issues, such as database file layout on disk or RAID devices, can be a source of performance problems. In OLTP applications, the amount of redo and undo generated can create bottlenecks in memory or I/O.

Some problems are reported by users but may not be apparent from reports that span intervals of 30 minutes or longer. Oracle Database has additional tools, such as Active Session History (ASH), that allow the DBA to view statistics and metrics over small segments of time in the recent past.

Many databases change gradually: the number of users, the amount of data, the number of reports, and the modules in use all evolve over time. These changes may lead to a degradation in performance. The proactive DBA captures and saves sets of statistics from periods when the database is performing acceptably, and compares them with statistics gathered when performance is poor to identify the differences.

The environment of the database is seldom static. Patches, upgrades, new hardware, or changes to the instance parameters can change the performance of the database. Sometimes a change improves performance in one area and causes another area to degrade. 

Locking issues are not common problems, but when they do occur they become very important.

Top SQL Reports


The following Top SQL sections sort the SQL statements with the top resource usage in multiple ways, as indicated by their titles:
  • SQL ordered by Elapsed Time
  • SQL ordered by CPU Time
  • SQL ordered by Gets
  • SQL ordered by Reads
  • SQL ordered by Executions
  • SQL ordered by Parse Calls
  • SQL ordered by Sharable Memory
  • SQL ordered by Version Count
These sorts allow you to find the troublesome SQL statements. You can also view the Complete List of SQL Text section to see the entire SQL text. In the example discussed here, a single SQL statement is responsible for almost all of the instance activity.
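
A rough SQL equivalent of these report sections can be run against the shared SQL area; the sketch below lists the top 10 statements by elapsed time (V$SQL times are in microseconds):

-- Top 10 SQL statements currently in the shared pool, ordered by elapsed time.
SELECT *
FROM  (SELECT sql_id,
              executions,
              ROUND(elapsed_time/1e6, 2) AS elapsed_secs,
              ROUND(cpu_time/1e6, 2)     AS cpu_secs,
              buffer_gets,
              disk_reads
       FROM   v$sql
       ORDER  BY elapsed_time DESC)
WHERE  ROWNUM <= 10;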

Setting the Priority

Determine which problem to tune first. In the performance reports, you see many statistics; even a well-tuned database shows a set of top wait events. The Oracle server provides a set of wait event statistics for processes that are idle or waiting. The Oracle server also records CPU utilization for processes that are running. To determine the impact of a particular event, it must be compared with the overall time spent. 


Each request to the database server has a response time consisting of a wait time and a service time. The service time is the time spent actively working on the request (CPU time). The wait time is by definition the time waiting for any reason. Both service time and wait time may be tuned. 

To tune the service time something has to change: the processing, the SQL, the access path, or the data storage structure. Wait times can be tuned by reducing contention for the resource where the wait is occurring. 

How to determine the priority

AWR reports can help you determine the priority.

The top wait events always have some values. In the example discussed here, the users are complaining of slow response time, yet the Top 5 Timed Foreground Events section shows only that the instance is using the CPU.

The Instance CPU section shows that the instance is using 65% of the OS CPU; this information is inconclusive. The instance is supposed to use CPU and not wait. This set of diagnostics may mean that the instance is CPU bound. As pointed out earlier, performance tuning can either reduce wait time or service time. In this case, the service time needs to be reduced. To reduce service time, SQL is the usual area to examine.


Setting the Priority: Example

The top wait events did not give a clear direction, so you continue with the time model to find which areas are consuming DB time. You can determine the top-priority tuning tasks by comparing the time spent in various waits and tasks with the overall wait time and service time. Both major tools report the Time Model Statistics to guide your tuning efforts. For example, the AWR report excerpt shows that the database CPU time (time in user calls) is 474.04 seconds, which is 44.85% of the total DB time. The “sql execute elapsed time” shows 1050.06 seconds; this time includes wait times. Just from this limited view, the wait times for SQL execution are significant, and would lead you to examine the wait statistics related to SQL execution and the SQL reports to identify individual SQL statements for tuning.

The “% of DB Time” values indicate the relative impact that tuning an area could have. If the “sql execute elapsed time” could be reduced, the maximum possible improvement is 1050 seconds, or 99%. SQL will always take some time to execute, so the actual improvement may be much less, depending on the amount of improvement you can get from that area.
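
The time model statistics used here come from V$SYS_TIME_MODEL; a minimal sketch of pulling the same figures directly (values are reported in microseconds):

-- DB time, DB CPU, and SQL execution time since instance startup, in seconds.
SELECT stat_name,
       ROUND(value/1e6, 2) AS seconds
FROM   v$sys_time_model
WHERE  stat_name IN ('DB time', 'DB CPU', 'sql execute elapsed time')
ORDER  BY value DESC;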


Defining the Problem


Problems can arise at any time. A proactive DBA watches for problems and corrects them before they are noticed by users. In the past, the discovery and definition step has been tedious and frequently dependent on listening to user feedback. User feedback is important, but is often subjective and not reproducible. In Oracle Database 11g, many of the following information sources can be viewed from the Enterprise Manager interface:
  • Monitor the current state of the database instance and compare it to a previous state.
    • Use Statspack or AWR to collect performance metrics regularly. Changes can point to issues before they become noticeable to users. 
    • Use OS or EM tools to check for CPU and disk queuing, disk utilization, and memory swapping. These are the signs of an overloaded system. 
  • Examine AWR or Statspack reports and instance files carefully.
    • Use the available tools, such as Statspack or AWR reports, to identify SQL statements in the applications that are consuming the most resources. Have these changed?
    • Check the alert logs, and trace files for error messages that might give a quick clue to the nature of the problem. Do not overlook system- and application-specific logs.
    • Ensure that the initialization parameter settings make sense for the system. 
    • Collect instance and OS statistics. Statspack reports point to the components where the greatest waits and the greatest use of resources occur. ADDM goes further by focusing on the components with the greatest potential benefit.

Sunday, February 23, 2014

Comparative Performance Analysis with AWR Baselines

What is the proper threshold to set on a performance metric? What is it that you want to detect? If you want to know that the performance metric value indicates that the server is nearing capacity, an absolute value is correct. But if you want to know that the performance is different today than it was at this time last week, or last month, the current performance must be compared to a baseline.

A baseline is a set of snapshots taken over a period of time. These snapshots are grouped statistically to yield a set of baseline values that vary over time. For example, the number of transactions per second in a certain database varies depending on the time of the day. The values for transactions per second are higher during working hours and lower during nonworking hours. The baseline records this variation and can be set to alert you if the current number of transactions per second is significantly different from the baseline values.

Oracle Database 11g baselines provide the data required to calculate time-varying thresholds based on the baseline data. The baseline allows a real-time comparison of performance metrics with baseline data and can be used to produce AWR reports that compare two periods.
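
The compare-periods report itself can be generated from SQL*Plus with the standard script, which prompts for the two snapshot ranges (or baselines) to compare:

SQL> @?/rdbms/admin/awrddrpt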




Using EM to Configure and Change Adaptive Threshold Settings

Steps to Configure and Change Adaptive Threshold Settings
  • Create the baseline.
  • Schedule the computation of baseline statistics.
  • After AWR baseline statistics are computed for a particular baseline, set metric thresholds specific to that baseline.

Compute baseline statistics directly from the Baselines page (as previously discussed). Then go to the AWR Baseline Metric Thresholds page and select the type of metrics that you want to set. When done, select a specific metric and click Edit Thresholds.

On the corresponding Edit AWR Baseline Metric Thresholds page, specify your thresholds in the Thresholds Settings section, and then click Apply Thresholds.

You can specify thresholds based on the statistics computed for your baseline. In addition to “Significance Level,” the other possibilities are “Percentage of Maximum” and “Fixed Values.”


Note: After a threshold is set using Baseline Metric Thresholds, the previous threshold values are forgotten forever and the statistics from the associated baseline are used to determine the threshold values until they are cleared (by using the Baseline Metric Threshold UI or PL/SQL interface).