X » 日志 » Tuning Oracle Sorting
Tuning Oracle Sorting
Robert 发表于 2008-06-10 23:53:02
As a small but very important component of SQL syntax, sorting is a frequently overlooked aspect of Oracle tuning. In general, an Oracle database will automatically perform sorting operations on row data as requested by a create index or an SQL ORDER BY or GROUP BY statement. In general, Oracle sorting occurs under the following circumstances:
-
SQL using the ORDER BY clause
-
SQL using the GROUP BY clause
-
When an index is created
-
When a MERGE SORT is invoked by the SQL optimizer because inadequate indexes exist for a table join
At the time a session is established with Oracle, a private sort area is allocated in RAM memory for use by the session for sorting. If the connection is via a dedicated connection, a Program Global Area (PGA) is allocated according to the sort_area_size Oracle parameter. For connections via the multithreaded server, sort space is allocated in the large_pool. Unfortunately, the amount of memory used in sorting must be the same for all sessions, and it is not possible to add additional sort areas for tasks that require large sort operations. Therefore, the designer must strike a balance between allocating enough sort area to avoid disk sorts for the large sorting tasks, keeping in mind that the extra sort area will be allocated and not used by tasks that do not require intensive sorting. Of course, sorts that cannot fit into the sort_area_size will be paged out into the TEMP tablespaces for a disk sort. Disk sorts are about 14,000 times slower than memory sorts.
As we noted, the size of the private sort area is determined by the sort_area_size Oracle parameter. The size for each individual sort is specified by the sort_area_ retained_size Oracle parameter. Whenever a sort cannot be completed within the assigned space, a disk sort is invoked using the temporary tablespace for the Oracle instance.
Disk sorts are expensive for several reasons. First, they are extremely slow when compared to an in-memory sort. Also, a disk sort consumes resources in the temporary tablespace. Oracle must also allocate buffer pool blocks to hold the blocks in the temporary tablespace. In-memory sorts are always preferable to disk sorts, and disk sorts will surely slow down an individual task as well as impact concurrent tasks on the Oracle instance. Also, excessive disk sorting will cause a high value for free buffer waits, paging other tasks' data blocks out of the buffer.
The following STATSPACK query uses the stats$sysstat table. From this table we can get an accurate picture of memory and disk sorts.
rpt_sorts_alert.sql
set pages 9999;
column mydate heading 'Yr. Mo Dy Hr.' format a16
column sorts_memory format 999,999,999
column sorts_disk format 999,999,999
column ratio format .99999
select
to_char(snap_time,'yyyy-mm-dd HH24') mydate,
newmem.value-oldmem.value sorts_memory,
newdsk.value-olddsk.value sorts_disk,
((newdsk.value-olddsk.value)/(newmem.value-oldmem.value)) ratio
from
perfstat.stats$sysstat oldmem,
perfstat.stats$sysstat newmem,
perfstat.stats$sysstat newdsk,
perfstat.stats$sysstat olddsk,
perfstat.stats$snapshot sn
where
newdsk.snap_id = sn.snap_id
and
olddsk.snap_id = sn.snap_id-1
and
newmem.snap_id = sn.snap_id
and
oldmem.snap_id = sn.snap_id-1
and
oldmem.name = 'sorts (memory)'
and
newmem.name = 'sorts (memory)'
and
olddsk.name = 'sorts (disk)'
and
newdsk.name = 'sorts (disk)'
and
newmem.value-oldmem.value > 0
and
newdsk.value-olddsk.value > 100
;
Here is the output from the script. Here, we can clearly see the number of memory sorts and disk sorts, and the ratio of disk to memory sorts.
Yr. Mo Dy Hr. SORTS_MEMORY SORTS_DISK RATIO
---------------- ------------ ------------ -------
2001-12-20 12 13,166 166 .01261
2001-12-20 16 25,694 223 .00868
2001-12-21 10 99,183 215 .00217
2001-12-21 15 13,662 130 .00952
2001-12-21 16 17,004 192 .01129
2001-12-22 10 18,900 141 .00746
2001-12-22 11 19,487 131 .00672
2001-12-26 12 12,502 147 .01176
2001-12-27 13 20,338 118 .00580
2001-12-27 18 11,032 119 .01079
2001-12-28 16 16,514 205 .01241
2001-12-29 10 17,327 242 .01397
2001-12-29 16 50,874 167 .00328
2001-01-02 08 15,574 108 .00693
2001-01-02 10 39,052 136 .00348
2001-01-03 11 13,193 153 .01160
2001-01-03 13 19,901 104 .00523
2001-01-03 15 19,929 130 .00652
This report can be changed to send an alert when the number of disk sorts exceeds a predefined threshold, and we can also modify it to plot average sorts by hour of the day and day of the week. The script here computes average sorts, ordered by hour of the day:
rpt_avg_sorts_hr.sql
set pages 9999;
column sorts_memory format 999,999,999
column sorts_disk format 999,999,999
column ratio format .99999
select
to_char(snap_time,'HH24'),
avg(newmem.value-oldmem.value) sorts_memory,
avg(newdsk.value-olddsk.value) sorts_disk
from
perfstat.stats$sysstat oldmem,
perfstat.stats$sysstat newmem,
perfstat.stats$sysstat newdsk,
perfstat.stats$sysstat olddsk,
perfstat.stats$snapshot sn
where
newdsk.snap_id = sn.snap_id
and
olddsk.snap_id = sn.snap_id-1
and
newmem.snap_id = sn.snap_id
and
oldmem.snap_id = sn.snap_id-1
and
oldmem.name = 'sorts (memory)'
and
newmem.name = 'sorts (memory)'
and
olddsk.name = 'sorts (disk)'
and
newdsk.name = 'sorts (disk)'
and
newmem.value-oldmem.value > 0
group by
to_char(snap_time,'HH24')
;
Here is the output from the script. We can now take this data and create a graph in a spreadsheet.
TO SORTS_MEMORY SORTS_DISK
-- ------------ ------------
00 18,855 11
01 19,546 15
02 10,128 5
03 6,503 8
04 10,410 4
05 8,920 5
06 8,302 7
07 9,124 27
08 13,492 71
09 19,449 55
10 19,812 106
11 17,332 78
12 20,566 76
13 17,130 46
14 19,071 61
15 19,494 68
16 20,701 79
17 19,478 44
18 23,364 29
19 13,626 20
20 11,937 17
21 8,467 7
22 8,432 10
23 11,587 10
Here is the plot from the output (Figure 9-16). Here we see a typical increase in sort activity during the online period of the day. Sorts rise about 8:00 a.m. and then go down after 6:00 p.m.
Figure 9-70: Average memory sorts by hour of the day
Now, let's run the script to compute the averages by the day of the week.
rpt_avg_sorts_dy.sql
set pages 9999;
column sorts_memory format 999,999,999
column sorts_disk format 999,999,999
column ratio format .99999
select
to_char(snap_time,'day') DAY,
avg(newmem.value-oldmem.value) sorts_memory,
avg(newdsk.value-olddsk.value) sorts_disk
from
perfstat.stats$sysstat oldmem,
perfstat.stats$sysstat newmem,
perfstat.stats$sysstat newdsk,
perfstat.stats$sysstat olddsk,
perfstat.stats$snapshot sn
where
newdsk.snap_id = sn.snap_id
and
olddsk.snap_id = sn.snap_id-1
and
newmem.snap_id = sn.snap_id
and
oldmem.snap_id = sn.snap_id-1
and
oldmem.name = 'sorts (memory)'
and
newmem.name = 'sorts (memory)'
and
olddsk.name = 'sorts (disk)'
and
newdsk.name = 'sorts (disk)'
and
newmem.value-oldmem.value > 0
group by
to_char(snap_time,'day')
;
Again, we will take the result set and plot it in a chart. This time, let's plot the
disk sorts.
DAY SORTS_MEMORY SORTS_DISK
--------- ------------ ------------
friday 12,545 54
monday 14,352 29
saturday 12,430 2
sunday 13,807 4
thursday 17,042 47
tuesday 15,172 78
wednesday 14,650 43
Figure 9-17 shows the graph. In this database, the activity pattern on Tuesday shows a large number of disk sorts, with another smaller spike on Thursdays. For this database, the DBA may want to pay careful attention to the TEMP tablespaces on these days, and perhaps issue a alter tablespace TEMP coalesce; to create continuous extents in the TEMP tablespace.
Figure 9-71: Average disk sorts by day of the week
At the risk of being redundant, we need to reemphasize that the single most important factor in the performance of any Oracle database is the minimization of disk I/O. Hence, the tuning of the Oracle sorting remains one of the most important considerations in the tuning of any Oracle database.
Now, let's turn our attention to the Oracle rollback segments and see how we can use STATSPACK to monitor the rollback segments and tune them for optimal performance.
转自:http://www.praetoriate.com/t_%20tuning_sorting.htm
