There are a number of tunables that can be adjusted to make ZFS perform best for different workloads.
vfs.zfs.arc_max
- Maximum size of the ARC. The default is all RAM but 1 GB, or 5/8 of all RAM, whichever is more. However, a lower value should be used if the system will be running any other daemons or processes that may require memory. This value can be adjusted at runtime with sysctl(8) and can be set in /boot/loader.conf or /etc/sysctl.conf.
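As an illustration only (the 4 GB figure is an arbitrary example, not a recommendation), the ARC could be capped immediately at runtime:
# sysctl vfs.zfs.arc_max=4294967296
The same limit can be made persistent across reboots by adding a line such as the following to /boot/loader.conf:
vfs.zfs.arc_max="4G"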
vfs.zfs.arc_meta_limit
- Limit the portion of the ARC that can be used to store metadata. The default is one fourth of vfs.zfs.arc_max. Increasing this value will improve performance if the workload involves operations on a large number of files and directories, or frequent metadata operations, at the cost of less file data fitting in the ARC. This value can be adjusted at runtime with sysctl(8) and can be set in /boot/loader.conf or /etc/sysctl.conf.
vfs.zfs.arc_min
- Minimum size of the ARC. The default is one half of vfs.zfs.arc_meta_limit. Adjust this value to prevent other applications from pressuring out the entire ARC. This value can be adjusted at runtime with sysctl(8) and can be set in /boot/loader.conf or /etc/sysctl.conf.
vfs.zfs.vdev.cache.size
- A preallocated amount of memory reserved as a cache for each device in the pool. The total amount of memory used will be this value multiplied by the number of devices. This value can only be adjusted at boot time, and is set in /boot/loader.conf.
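For example, with a hypothetical setting of 10 MB on a pool of eight devices, 10 MB × 8 = 80 MB of memory would be reserved in total. The setting itself would go in /boot/loader.conf, in the same style as the example later in this section:
vfs.zfs.vdev.cache.size="10M"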
vfs.zfs.min_auto_ashift
- Minimum ashift (sector size) that will be used automatically at pool creation time. The value is a power of two. The default value of 9 represents 2^9 = 512, a sector size of 512 bytes. To avoid write amplification and get the best performance, set this value to the largest sector size used by a device in the pool. Many drives have 4 KB sectors. Using the default ashift of 9 with these drives results in write amplification on these devices. Data that could be contained in a single 4 KB write must instead be written in eight 512-byte writes. ZFS tries to read the native sector size from all devices when creating a pool, but many drives with 4 KB sectors report that their sectors are 512 bytes for compatibility. Setting vfs.zfs.min_auto_ashift to 12 (2^12 = 4096) before creating a pool forces ZFS to use 4 KB blocks for best performance on these drives. Forcing 4 KB blocks is also useful on pools where disk upgrades are planned. Future disks are likely to use 4 KB sectors, and ashift values cannot be changed after a pool is created. In some specific cases, the smaller 512-byte block size might be preferable. When used with 512-byte disks for databases, or as storage for virtual machines, less data is transferred during small random reads. This can provide better performance, especially when using a smaller ZFS record size.
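A minimal sketch of this procedure, assuming a hypothetical pool named mypool built from two 4 KB-sector disks ada0 and ada1:
# sysctl vfs.zfs.min_auto_ashift=12
# zpool create mypool mirror ada0 ada1
The sysctl must be set before zpool create is run, because the ashift of a vdev cannot be changed afterwards.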
vfs.zfs.prefetch_disable
- Disable prefetch. A value of 0 is enabled and 1 is disabled. The default is 0, unless the system has less than 4 GB of RAM. Prefetch works by reading larger blocks than were requested into the ARC in hopes that the data will be needed soon. If the workload has a large number of random reads, disabling prefetch may actually improve performance by reducing unnecessary reads. This value can be adjusted at any time with sysctl(8).
vfs.zfs.vdev.trim_on_init
- Control whether new devices added to the pool have the TRIM command run on them. This ensures the best performance and longevity for SSDs, but takes extra time. If the device has already been secure erased, disabling this setting will make the addition of the new device faster. This value can be adjusted at any time with sysctl(8).
vfs.zfs.vdev.max_pending
- Limit the number of pending I/O requests per device. A higher value will keep the device command queue full and may give higher throughput. A lower value will reduce latency. This value can be adjusted at any time with sysctl(8).
vfs.zfs.top_maxinflight
- Maximum number of outstanding I/Os per top-level vdev. Limits the depth of the command queue to prevent high latency. The limit is per top-level vdev, meaning the limit applies to each mirror, RAID-Z, or other vdev independently. This value can be adjusted at any time with sysctl(8).
vfs.zfs.l2arc_write_max
- Limit the amount of data written to the L2ARC per second. This tunable is designed to extend the longevity of SSDs by limiting the amount of data written to the device. This value can be adjusted at any time with sysctl(8).
vfs.zfs.l2arc_write_boost
- The value of this tunable is added to vfs.zfs.l2arc_write_max and increases the write speed to the SSD until the first block is evicted from the L2ARC. This “Turbo Warmup Phase” is designed to reduce the performance loss from an empty L2ARC after a reboot. This value can be adjusted at any time with sysctl(8).
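As a purely illustrative example (the figures are arbitrary and should be matched to the endurance of the cache device), the steady-state limit could be raised to 64 MB per second, with an additional 128 MB per second permitted during the warmup phase, by adding these lines to /etc/sysctl.conf:
vfs.zfs.l2arc_write_max=67108864
vfs.zfs.l2arc_write_boost=134217728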
vfs.zfs.scrub_delay
- Number of ticks to delay between each I/O during a scrub. To ensure that a scrub does not interfere with the normal operation of the pool, if any other I/O is happening the scrub will delay between each command. This value controls the limit on the total IOPS (I/Os Per Second) generated by the scrub. The granularity of the setting is determined by the value of kern.hz, which defaults to 1000 ticks per second. This setting may be changed, resulting in a different effective IOPS limit. The default value is 4, resulting in a limit of: 1000 ticks/sec / 4 = 250 IOPS. Using a value of 20 would give a limit of: 1000 ticks/sec / 20 = 50 IOPS. The speed of scrub is only limited when there has been recent activity on the pool, as determined by vfs.zfs.scan_idle. This value can be adjusted at any time with sysctl(8).
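For instance, reusing the arithmetic above, the scrub could be throttled more aggressively on a busy pool by raising the delay at runtime (20 is only an example figure):
# sysctl vfs.zfs.scrub_delay=20
With the default kern.hz of 1000, this yields 1000 ticks/sec / 20 = 50 IOPS whenever the pool is not idle.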
vfs.zfs.resilver_delay
- Number of ticks of delay inserted between each I/O during a resilver. To ensure that a resilver does not interfere with the normal operation of the pool, if any other I/O is happening the resilver will delay between each command. This value controls the limit of total IOPS (I/Os Per Second) generated by the resilver. The granularity of the setting is determined by the value of kern.hz, which defaults to 1000 ticks per second. This setting may be changed, resulting in a different effective IOPS limit. The default value is 2, resulting in a limit of: 1000 ticks/sec / 2 = 500 IOPS. Returning the pool to an Online state may be more important if another device failing could Fault the pool, causing data loss. A value of 0 will give the resilver operation the same priority as other operations, speeding the healing process. The speed of resilver is only limited when there has been other recent activity on the pool, as determined by vfs.zfs.scan_idle. This value can be adjusted at any time with sysctl(8).
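For example, if the pool is running degraded and a second device failure would risk data loss, the resilver throttle could be removed for the duration of the repair (and restored to the default of 2 afterwards):
# sysctl vfs.zfs.resilver_delay=0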
vfs.zfs.scan_idle
- Number of milliseconds since the last operation before the pool is considered idle. When the pool is idle, the rate limiting for scrub and resilver is disabled. This value can be adjusted at any time with sysctl(8).
vfs.zfs.txg.timeout
- Maximum number of seconds between transaction groups. The current transaction group will be written to the pool and a fresh transaction group started if this amount of time has elapsed since the previous transaction group. A transaction group may be triggered earlier if enough data is written. The default value is 5 seconds. A larger value may improve read performance by delaying asynchronous writes, but this may cause uneven performance when the transaction group is written. This value can be adjusted at any time with sysctl(8).
Some of the features provided by ZFS are memory intensive, and may require tuning for maximum efficiency on systems with limited RAM.
As a bare minimum, the total system memory should be at least one gigabyte. The amount of recommended RAM depends upon the size of the pool and which ZFS features are used. A general rule of thumb is 1 GB of RAM for every 1 TB of storage. If the deduplication feature is used, a general rule of thumb is 5 GB of RAM per TB of storage to be deduplicated. While some users have successfully used ZFS with less RAM, systems under heavy load may panic due to memory exhaustion. Further tuning may be required for systems with less than the recommended RAM requirements.
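As a worked example of these rules of thumb, a hypothetical 10 TB pool would call for roughly 10 GB of RAM, and enabling deduplication across all 10 TB would raise that to roughly 10 TB × 5 GB/TB = 50 GB.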
Due to the address space limitations of the i386™ platform, ZFS users on the i386™ architecture must add this option to a custom kernel configuration file, rebuild the kernel, and reboot:
options KVA_PAGES=512
This option expands the kernel address space, allowing the vm.kvm_size tunable to be pushed beyond the currently imposed limit of 1 GB, or the limit of 2 GB for PAE. To find the most suitable value for this option, divide the desired address space in megabytes by four. In this example, it is 512 for 2 GB.
The kmem address space can be increased on all FreeBSD architectures. On a test system with 1 GB of physical memory, success was achieved with these options added to /boot/loader.conf, and the system restarted:
vm.kmem_size="330M"
vm.kmem_size_max="330M"
vfs.zfs.arc_max="40M"
vfs.zfs.vdev.cache.size="5M"
For a more detailed list of recommendations for ZFS-related tuning, see https://wiki.freebsd.org/ZFSTuningGuide.