Raw disk I/O performance on FreeBSD
Author: Willem Jan Withagen (wjw@withagen.nl), date: 13 October 2005
Introduction
Driven by remarks on the FreeBSD mailing lists, some information in an article
and my personal curiosity, I started to look into evaluating the raw I/O
performance of disks.
Tests
Simple dd(1)
In general dd(1) is considered a (very) poor test of disk I/O performance,
but here it serves as a simple first approach to see what is going on and
to get a first indication of what to expect. So I created a simple setup
and ran the initial tests.
Setup
On a server with 2 disks, the OS (FreeBSD 5.4) is installed on the first disk. The
second disk is not partitioned, formatted, or anything else, so it is only available
as the raw device. dd(1) is then used to write blocks of a fixed size (from /dev/zero)
to successive parts of the disk until the disk is filled. dd(1) is used because it
reports the transfer rate all by itself, and judging from its code this figure should
be fairly accurate.
(This removes the burden of modifying dd(1) or writing a special tool.)
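As an illustration of the setup, here is a minimal sketch that builds one dd(1) invocation per block position. It assumes each block is positioned with the seek= operand and that dd(1) is started once per block; the device name /dev/ad6, the disk size and the helper names are placeholders of mine, not the actual test script.

```python
import itertools


def dd_command(device, block_bytes, index, source="/dev/zero"):
    """Return the dd(1) argument list that writes one block at position `index`.

    Assumption: the block is positioned with seek=, which counts in units
    of the output block size (bs).
    """
    return [
        "dd",
        f"if={source}",
        f"of={device}",
        f"bs={block_bytes}",
        "count=1",
        f"seek={index}",
    ]


def plan_run(device, disk_bytes, block_bytes):
    """Yield one dd(1) command per full block, from the start of the disk onward."""
    for index in range(disk_bytes // block_bytes):
        yield dd_command(device, block_bytes, index)


if __name__ == "__main__":
    # Placeholder device and a nominal 80 GB disk; only print the commands,
    # never execute them blindly against a disk that holds data.
    for cmd in itertools.islice(plan_run("/dev/ad6", 80 * 10**9, 5 * 2**20), 3):
        print(" ".join(cmd))
```

Each command writes exactly one block, so the rate that dd(1) reports belongs to a single position on the disk.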
Results
Each graph is based on 10 runs for each of the block sizes (1, 5, 10 and 20 Mbyte).
Diskindex refers to the location on the disk. The disk size is divided by the
write block size, giving the maximum block count. The index then counts from 0
to the maximum block count.
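The index arithmetic can be sketched as follows. This assumes that 1 Mbyte means 2^20 bytes and uses a nominal 80 * 10^9 byte disk as a stand-in; the real capacity of the test disk will differ somewhat, and the function names are mine.

```python
MBYTE = 2**20  # assumption: "Mbyte" means 2^20 bytes


def max_block_count(disk_bytes, block_bytes):
    """Number of full blocks that fit on the disk."""
    return disk_bytes // block_bytes


def byte_offset(index, block_bytes):
    """Byte position on the disk where the block with this index starts."""
    return index * block_bytes


if __name__ == "__main__":
    disk_bytes = 80 * 10**9  # nominal size, placeholder for the real disk
    for size_mb in (1, 5, 10, 20):
        count = max_block_count(disk_bytes, size_mb * MBYTE)
        # Indices 0 .. count-1 address full blocks.
        print(f"{size_mb:2d} Mbyte blocks: max. blockcount {count}")
```

Because the index depends on the block size, the same index in two graphs points at different places on the disk; the byte offset is what makes the graphs comparable.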
Block Reading
Block Writing
Other disks:
BigFoot
DiamondMax9
WD2500YD
st38421
twa-single
wd800-gmirror
wd800-sata
Observations:
-
The first part of the 1 Mbyte block writes does not agree in value with the picture
that we see in the other three graphs: 5, 10 and 20 Mbyte. The last three give more
or less identical values.
This might be because, on the fast first part of the disk, the cache can keep up
with the transfers. The difference would then be caused by the "slow" process
of starting dd(1) anew for every block.
At the other end of the disk, where the slow tracks are, all four runs are
at the same speed.
This would suggest that the curve shown by the 5 Mbyte block writes is an indication
of the upper limit on the write transfer rate on that specific part of the disk.
-
"systat -vm 1" shows a maximum of 85% disk usage; the disk never gets fully saturated.
I am not sure whether this is because all transfers complete within the 1 second
sample time that systat(1) runs on.
-
We have not (yet) accounted for the transfer-time from /dev/zero into the buffer,
before writing to the disk device. This would have the largest impact on the 1Mbyte transfers.
-
When running enough samples, some sort of transfer "shadowing" starts to show. This means
that there were several transfers that were consistently slower than the maximum
transfer rate. The fact that the shadowing reproduces the same block pattern as the main line,
only a little slower, suggests that there is some sort of deterministic process interfering
with either the transfer or the time measurement.
Time quantisation could be the cause of the very large dispersion with the 1 Mbyte blocks.
A typical sample is: 1048576 bytes transferred in 0.014545 secs (72090850 bytes/sec)
or 1048576 bytes transferred in 0.021420 secs (48953123 bytes/sec).
Perhaps the time measurement is not accurate enough to really differentiate between these values.
However, the fact that the 5 Mbyte graph actually has the fewest and smallest deviations from
the main line would indicate that it is not a result of time quantisation. If that were the case,
the effect would decrease as the sample duration became longer, i.e. the relative influence
of the quantisation would be four times less with 20 Mbyte blocks than with 5 Mbyte blocks.
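To make the quantisation argument concrete, here is a small sketch that parses dd(1) summary lines of the form quoted above and estimates how much one timer step weighs relative to the duration of a transfer. The timer resolution used here is purely hypothetical, as are the function names; I have not measured the real resolution.

```python
import re

# Matches the summary line format quoted in the text.
DD_SUMMARY = re.compile(
    r"(\d+) bytes transferred in ([\d.]+) secs \((\d+) bytes/sec\)"
)


def parse_dd_summary(line):
    """Return (bytes, seconds, reported bytes/sec) from a dd(1) summary line."""
    match = DD_SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a dd summary line: {line!r}")
    return int(match.group(1)), float(match.group(2)), int(match.group(3))


def relative_timing_error(block_bytes, rate_bytes_per_sec, resolution_secs):
    """Fraction of the transfer duration represented by one timer step."""
    duration = block_bytes / rate_bytes_per_sec
    return resolution_secs / duration


if __name__ == "__main__":
    samples = [
        "1048576 bytes transferred in 0.014545 secs (72090850 bytes/sec)",
        "1048576 bytes transferred in 0.021420 secs (48953123 bytes/sec)",
    ]
    for line in samples:
        nbytes, secs, rate = parse_dd_summary(line)
        print(f"{nbytes} bytes in {secs} s, reported {rate} bytes/sec")

    resolution = 0.001  # hypothetical timer step of 1 ms, for illustration only
    rate = 72090850     # rate taken from the first sample above
    for size_mb in (1, 5, 10, 20):
        err = relative_timing_error(size_mb * 2**20, rate, resolution)
        print(f"{size_mb:2d} Mbyte: one timer step is {err:.1%} of the duration")
```

Whatever the real resolution is, the relative error scales inversely with the block size at a given rate, so under this model the 20 Mbyte runs should show a quarter of the spread of the 5 Mbyte runs.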
Systems
Dual Opteron 244, 1 GB RAM
Harddisk:
Western Digital WD800 SATA disk (WD800JD), 8 MB cache
On both the server and clients all processes which are not required for the
tests are terminated, especially cron(8), syslogd(8) and sendmail(8).
Do it Yourself
to be filled
Interesting other reading
- NFS Tricks and Benchmarking Traps, Daniel Ellard, Margo Seltzer,
Harvard University,
Proceedings of the FREENIX Track: 2003 USENIX Annual Technical Conference
San Antonio, Texas, USA, June 9-14, 2003