Lesson 1. General Assessment of the State of Oracle Instance.
In this lesson we are going to focus on the Main Window. It is a convenient place to look at your database and jump to the details of the problem. There are things you should know about this window that will increase the efficiency of your work.
It takes some time to get familiar with the activity in your particular database. In general you should be concerned about how much system resources are in use: I/O load; CPU utilization; redo writing. You should assess the resource cost versus the useful work performed by database: number of users connected, rate of SQL execution, number of commits, SQLNet traffic etc. This is exactly what the Main Window should be used for. The charts in this window provide the last 15-20 minutes of history. This helps visualize the trends in the database load.
If you run a 10g database, make sure that the "CPU Sys+User" statistic is selected in one of the auxiliary charts. If you use older versions, please use third party tools to track CPU utilization.
Let's look at every statistic in order of importance. Below are general recommendations; you may have your own sequence of checks depending on your database type and needs.
Let's start with the CPU - it always makes sense to check CPU first. A load <25% indicates a relatively happy system. A load 50% and higher may indicate ongoing problems: you should think about finding and fixing the SQL statements responsible for CPU load, reducing the load, or investing in faster or more CPUs.
If it is not CPU, your system may be slow due to excessive waits. Keep an eye on top waits in the tabular view in the upper part of the window. Just a reminder: if the percentage of wait time exceeds 100% - that indicates more than one session is waiting on this event. If you see a high percentage of waits, click on this view to open the Activity Explorer window to see the history of waits. Wait events is a big subject covered in Lesson 2 where the Activity Explorer is explained. Whole database tuning can be based on exploring waits, but for now let's return to the Main Window.
Based on the nature of waits there, the system assessment can go in a number of different directions; you can be prompted to focus on physical reads, redo logging, DB buffers, log archiver etc. Over time you will become familiar with all these paths. For now, let's assume that your waits are moderately low. Next, we will take a look at physical reads. Decisions based on physical reads will depend on the type of database (OLTP, data warehouse, or hybrid) and the throughput of your system. Remember that Oracle has two types of physical reads: multi-block reads (db file scattered) used in full scans and single-block reads (db file sequential) used in index-based access. Therefore, a large number of blocks reads per second may be normal for multi-block reads but may be an indication of a problem for single-block reads due to saturation of the I/O subsystem by a large number of small I/O. If you have I/O problems, they will also show up in the database waits mentioned in the previous paragraph.
If you have parallel queries running in your system, you will see I/O activity in the "Direct Physical Reads" chart. You should also see busy PQ servers in the "PQ Servers" chart. These charts are helpful in monitoring data warehouse activity. Data warehouse environment will also typically see writes and reads into / from temporary tablespace(s) during large sorts and hash joins. The amount of temporary tablespace used for this activity can be seen in "Temporary Tablespaces info" on the right.
Let's imagine you are concerned about high I/O load and want more details. Click on the Physical Reads graph to open a larger graph window with more details. You can make the graphical chart bigger by resizing the window and you can change time resolution to see larger period of time or scroll to the past. A quick tip: in order to scroll the graph horizontally, hold the right mouse button on the graphical area and move left or right. If you move up or down, the vertical scale will change.
While having the statistic detailed graph open, let's assume that load has been uneven: there were periods of abnormally higher load. You may need even more details about that period of time. First, select the period of time around the spike by highlighting with the mouse. Then, right click to open a pop-up menu. The first two items "Show in Activity Explorer" and "Show in SQL Explorer" call two very powerful tools to explore database activity in high detail. Select the first choice to open the Activity Explorer. Please note that the same time interval is already selected. All data in the Activity Explorer is summarized for the time period of interest. The SQL Explorer can be called in a similar manner and will also retain the selected time interval.
The previous paragraph demonstrated a very powerful way of troubleshooting
a database by using Lab128. Troubleshooting can be narrowed from
the general "big" picture of the database activity in the Main Window,
down to the detailed statistics view, focusing on specific time intervals
and calling the Activity or SQL explorers.
This, of course, applies to any statistics graph in the Main Window, not only physical reads. The same principle also applies to the graphs in other windows, not just the Main Window. For now, let's continue our tour of the Main Window. If you are in some other window, press the "Home" button in the toolbar, or press the "Back" arrow until you get back to the Main Window.
Let's take a look at the left part of the Main Window. It shows the network activity and the number of user sessions connected to the database. These statistics are self-explanatory. Typically, you should be concerned about the number of active sessions. A high number of active sessions usually indicate a problem. Note that "Users" and "Active" are underlined, which means they are links to other windows. Click on either link to open the Sessions window. This window can also be opened by using the button in the tool bar, or by using the main menu. This demonstrates the behavior of links in Lab128. The links have same appearance and behave the same way throughout the application.
Now return to the Main Window. There are more links in this window. Many of them are in the central section: "Buffer Explorer", "All Ratios", "Library Cache Ratios", "Redo log wait charts". If you followed this lesson, you already know that the Activity Explorer can be called by clicking on the Wait Event tabular view. Other tabular views, located in the right part of the window, also serve as links to the corresponding windows: "Tablespaces & Datafiles", "Transaction and RBS", "Temp segment".
The transactional activity is represented in the Main Window by the "Redo Entries", "Redo Write", "Rollback Writes", "Log Sync Wait" graphs. The redo log writer and archiver activity is monitored by the "Redo Log Written", number of "ARC" processes, "Switch Rate", and "Not Archived" graphs in the bottom part of the window. To see the description of the monitored statistics, right-click on the graph and select "About Statistics". This is an excellent way to refresh your memory about the most important Oracle performance statistics.
In conclusion, the Main Window provides the opportunity to explore activity history by scrolling back in time using the scroll bar at the bottom of the window. When scrolling, a time window appears showing the time. Casual scrolling back in time can be a bit more difficult than opening a detailed graphical chart window for individual statistics and scrolling back in time there. The detailed chart window is very easy to scroll. But if you already know a particular time to investigate, it is highly recommended to scroll the Main Window to that point to get a comprehensive view of the database state.
Lesson Summary. The Main window is a convenient place to start exploring the database. It has many links to call other windows in the application. Statistic graphs can be "magnified" by clicking and opening a detailed graphical chart. This very powerful technique has been described as drilling down the general "big" picture to detailed statistics views, then selecting a time interval and calling Activity or SQL Explorers.