In this post, we study how the overwrite and file_append options work in the fio benchmark.

overwrite and file_append options

  • overwrite=bool

If true, writes to a file will always overwrite existing data. If the file doesn’t already exist, it will be created before the write phase begins. If the file exists and is large enough for the specified write phase, nothing will be done. Default: false.

  • file_append=bool

Perform I/O after the end of the file. Normally fio will operate within the size of a file. If this option is set, then fio will append to the file instead. This has identical behavior to setting offset to the size of a file. This option is ignored on non-regular files.

  • offset=int

Start I/O at the provided offset in the file, given as either a fixed size in bytes, zones or a percentage. If a percentage is given, the generated offset will be aligned to the minimum blocksize or to the value of offset_align if provided. Data before the given offset will not be touched. This effectively caps the file size at real_size - offset. Can be combined with size to constrain the start and end range of the I/O workload. A percentage can be specified by a number between 1 and 100 followed by ‘%’, for example, offset=20% to specify 20%. In ZBD mode, value can be set as number of zones using ‘z’.
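To make the percentage form concrete, the following is a small arithmetic sketch (in Go, not fio code) of how a 20% offset into a 1 GiB file would be aligned down to a 4k blocksize. The exact rounding fio performs internally is an assumption here; the point is only that the generated offset ends up a multiple of the blocksize.

package main

import "fmt"

// alignedOffset computes a percentage-based offset and aligns it down
// to the blocksize, mirroring the rule described above.
func alignedOffset(fileSize, blockSize, pct uint64) uint64 {
    off := fileSize * pct / 100
    return off - off%blockSize // align down to a blocksize boundary
}

func main() {
    // 20% of a 1 GiB file with a 4 KiB blocksize
    fmt.Println(alignedOffset(1<<30, 4096, 20)) // prints 214745088
}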

Fio commands

The following fio commands are used to study the initial write, overwrite and append write. We also run one final randread at the end. Before each run, we drop the kernel cache with “echo 3 > /proc/sys/vm/drop_caches”.

$ fio --name=test --ioengine=libaio --blocksize=4k --readwrite=randwrite --directory=/mnt/bench1 --nrfiles=1 --size=1g --end_fsync=1 --numjobs=1 --direct=1
$ fio --name=test --ioengine=libaio --blocksize=4k --readwrite=randwrite --directory=/mnt/bench1 --nrfiles=1 --size=1g --end_fsync=1 --numjobs=1 --direct=1
$ fio --name=test --ioengine=libaio --blocksize=4k --readwrite=randwrite --directory=/mnt/bench1 --nrfiles=1 --size=1g --end_fsync=1 --numjobs=1 --direct=1 --overwrite=1
$ fio --name=test --ioengine=libaio --blocksize=4k --readwrite=randwrite --directory=/mnt/bench1 --nrfiles=1 --size=1g --end_fsync=1 --numjobs=1 --direct=1 --file_append=1
$ fio --name=test --ioengine=libaio --blocksize=4k --readwrite=randread --directory=/mnt/bench1 --nrfiles=1 --size=1g --end_fsync=1 --numjobs=1 --direct=1

Fio results

With the option “--file_append=1”, fio will append to the end of the file and expand it to the proper size. The write performance is similar to the initial write.

With the option “--overwrite=1”, fio will overwrite the existing data. If the file doesn’t already exist, it will be created before the write phase begins. If the file exists and is large enough for the specified write phase, nothing will be done.

The following table shows the fio results.
Test Name     | Fio Options              | Result                           | File Size
Initial Write | -                        | write: IOPS=7009, BW=27.4MiB/s   | 0GB -> 1GB
Overwrite     | --overwrite=0 (default)  | write: IOPS=10.4k, BW=40.4MiB/s  | 1GB
Overwrite     | --overwrite=1            | write: IOPS=10.0k, BW=39.2MiB/s  | 1GB
Append Write  | --file_append=1          | write: IOPS=6667, BW=26.0MiB/s   | 1GB -> 2GB
Random Read   | -                        | read: IOPS=7769, BW=30.3MiB/s    | 2GB

Fio syscalls for create, write and read

The file space is allocated with fallocate(2) when fio lays out the I/O file.

29247 write(1, "Starting 1 process\n", 19) = 19
29247 stat("/mnt/bench1/test.0.0", 0x7ffd59df9db0) = -1 ENOENT (No such file or directory)
29247 write(1, "test: Laying out IO file (1 file"..., 44) = 44
29247 unlink("/mnt/bench1/test.0.0")    = -1 ENOENT (No such file or directory)
29247 open("/mnt/bench1/test.0.0", O_WRONLY|O_CREAT, 0644) = 3
29247 fallocate(3, 0, 0, 1073741824)    = 0
29247 fadvise64(3, 0, 1073741824, POSIX_FADV_DONTNEED) = 0
29247 close(3)

The following are the syscalls when fio writes data with ioengine libaio.

29429 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
29429 fadvise64(3, 1073741824, 1073741824, POSIX_FADV_DONTNEED) = 0
29429 fadvise64(3, 1073741824, 1073741824, POSIX_FADV_RANDOM) = 0
29429 io_submit(0x7fbe9301c000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=4096, aio_offset=1138499584}]) = 1
29429 io_getevents(0x7fbe9301c000, 1, 1, [{data=0, obj=0xa77de0, res=4096, res2=0}], NULL) = 1
29429 io_submit(0x7fbe9301c000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\340`o\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=4096, aio_offset=1868619776}]) = 1
29429 io_getevents(0x7fbe9301c000, 1, 1, [{data=0, obj=0xa77de0, res=4096, res2=0}], NULL) = 1
[..]
29429 io_submit(0x7fbe9301c000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\300\357y\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=4096, aio_offset=2046550016}]) = 1
29429 io_getevents(0x7fbe9301c000, 1, 1, [{data=0, obj=0xa77de0, res=4096, res2=0}], NULL) = 1
29429 fsync(3)                          = 0
29429 close(3)

The following are the syscalls when fio reads data with ioengine libaio.

29484 open("/mnt/bench1/test.0.0", O_RDONLY|O_DIRECT) = 3
29484 fadvise64(3, 0, 1073741824, POSIX_FADV_DONTNEED) = 0
29484 fadvise64(3, 0, 1073741824, POSIX_FADV_RANDOM) = 0
29484 io_submit(0x7f069d23a000, 1, [{aio_lio_opcode=IOCB_CMD_PREAD, aio_fildes=3, aio_buf=0x9d1000, aio_nbytes=4096, aio_offset=64757760}]) = 1
29484 io_getevents(0x7f069d23a000, 1, 1, [{data=0, obj=0x9d3da0, res=4096, res2=0}], NULL) = 1
29484 io_submit(0x7f069d23a000, 1, [{aio_lio_opcode=IOCB_CMD_PREAD, aio_fildes=3, aio_buf=0x9d1000, aio_nbytes=4096, aio_offset=794877952}]) = 1
29484 io_getevents(0x7f069d23a000, 1, 1, [{data=0, obj=0x9d3da0, res=4096, res2=0}], NULL) = 1
[..]
29484 io_submit(0x7f069d23a000, 1, [{aio_lio_opcode=IOCB_CMD_PREAD, aio_fildes=3, aio_buf=0x9d1000, aio_nbytes=4096, aio_offset=972808192}]) = 1
29484 io_getevents(0x7f069d23a000, 1, 1, [{data=0, obj=0x9d3da0, res=4096, res2=0}], NULL) = 1
29484 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=1, tv_usec=706077}, ru_stime={tv_sec=7, tv_usec=828474}, ...}) = 0
29484 close(3)

After a terrible plane crash, Brian Robeson learns to survive alone in the wild and, in the process, grows into manhood. The setting of the story is not precise, for Brian is very lost. Brian flies out of the airport in Hampton, New York, headed to the oil fields of northern Canada to visit his father. On the way there, the pilot has a sudden heart attack and bumps the steering wheel, causing them to go off course. Brian lands the plane somewhere in the northern Canadian woods, far away from where he was supposed to end up. The North Woods of Canada are home to a variety of animal species, including wolves, deer, black bears, otters, skunks, and many more. This wilderness setting has a big impact on the plot. Brian is at the mercy of many dangerous animals: he is attacked by a porcupine, sprayed by a skunk, and charged by a moose. He must muster all of his courage to survive in the wild. Lastly, the setting forces him to learn the “primitive ways of man”, like making a fire with sticks. This setting is the base of the story, for only in these conditions would he be faced with so many wild dangers.

The inciting incident of the story is when the pilot suddenly has a fatal heart attack and dies moments later. Brian tries to send for help by radioing the station several times, but the connection soon breaks. It is up to Brian to save himself. Remembering the brief flying lesson the pilot had given him, he steers the plane and crash-lands into a lake. He swims to shore and then faints from exhaustion. When he wakes up, he knows that he has to find a food source and shelter. He eats gut berries (chokecherries) but gets sick since they are poisonous. He has to find a new food source and spots a raspberry bush. Now that he has the food problem covered, he has to find a suitable shelter. He decides to build a lean-to, which is a shelter made of sticks and twigs. Once, he notices that striking his hatchet on the stone creates sparks. He tries to make a fire, and after many failures, finally succeeds. This is the turning point of his journey. Since he has mastered fire, he can cook food instead of only eating berries. For a while after he masters fire, Brian has an easier time. He finds turtle eggs and starts exploring ways to hunt and fish. He has to “invent” the bow and arrow. After a while, he can easily catch fish to eat. Once, a search plane passes over him but sadly doesn’t spot Brian, which leaves him devastated. He loses his hope and will for a while, and even tries to commit suicide by cutting himself with the hatchet. But soon he gets back on his feet. Brian perfects his bow and arrow design, and hunts foolbirds and rabbits. One night, a skunk comes because it smells the hidden turtle eggs. Brian gets sprayed and almost loses his food supply. However, his optimism grows stronger with each setback, and he is determined to survive until he gets rescued.

This lasts until he is struck by a tornado, which is the climax of the story. Even though his shelter and food supply are all gone, Brian discovers the tail of the plane sticking out of the water after the tornado. He gets hopeful because there is a survival pack in the plane. After building a raft, he sails to the plane and uses his hatchet to cut into the tail. He loses his hatchet and dives after it; he not only finds the hatchet but also the survival pack. When he returns to shore, he finds out what treasures the pack contains: a .22 survival rifle, sleeping gear, pots, pans, food, matches, and an emergency transmitter. Thinking the transmitter is broken, he flips the switches on and off a few times before tossing it away. He cooks a feast and then drifts off to sleep. Without warning, a bush plane arrives and Brian is rescued after spending 54 days in the woods. Brian returns to the city to live with his mother. He researches the plants and animals he found, and sometimes dreams about his experiences in the woods. His parents never reconcile, and Brian cannot bring himself to tell his dad about the man with blond hair.

The main character of the story is Brian Robeson. He starts out as an ordinary thirteen-year-old boy who lives in New York City with his mother after his parents’ divorce. He is on his way to visit his father in the oil fields of Canada when the pilot has a sudden heart attack and dies. The plane crashes, and Brian suddenly finds himself stranded in a deserted place where there is no human being but himself. The only tools he has with him are a hatchet and a tattered windbreaker. He is torn apart both externally and internally. His mother’s affair with another man affects Brian deeply; he bears the burden of “The Secret” and is troubled about whether to tell his father. Brian also has to face dangers from wildlife and must learn to adapt to his new environment. It will take all his self-reliance, determination, and knowledge to survive. Brian learns to hunt game like small fish and foolbirds. He blends in with nature and appreciates its beauty. He realizes that as long as he stays positive, he can accomplish his goals. In order to live for 54 days in the woods, Brian shows the values of optimism, self-reliance, perseverance, and rationality. He arrives in the woods as a vulnerable and pitiful little boy, and gradually learns the power of positive thinking. Initially, Brian’s setbacks leave him frustrated, hopeless, and full of self-pity. He dreams of his past life of pleasure and ease, when things just came to him and he never had trouble acquiring his daily needs. Now, alone in the woods, he has to fend for himself. Brian also recalls the words his English teacher used to say. Mr. Perpich was always saying, “You are your most valuable asset.” Brian begins to realize that it is up to him to improve his conditions and survive. One night, he is attacked by a porcupine and gets hundreds of quills driven into his leg. For some time, Brian cries in pain and despair, but soon reemerges with a new perspective: crying isn’t beneficial. In the city, it may arouse pity in people, but in the wild he is alone and must bring about change himself. The most important rule in survival is that feeling sorry for yourself doesn’t work. This realization motivates Brian mentally. He keeps a generally positive attitude, despite some lapses, most noticeably the suicide attempt. With each setback, he grows stronger. For example, the moose attack gravely injures him, but he perseveres and keeps fighting. Even though the tornado destroys his shelter, he remains hopeful and discovers the tail of the plane, which leads to his eventual rescue. I learned from Brian that in times of danger, negative values like self-pity and frustration only weaken your resolve. Dependence on others doesn’t help. If everyone relies on others to make a change, nothing will happen in society. I have to bring about change to enhance my own life.

The major theme of the story is developing manhood. At the beginning of the story, Brian thinks of himself as part of a family, so the divorce hits him with immense pain. He is no longer able to identify himself with his family, and he is not ready to accept reality, which means becoming an adult and a totally separate person. The plane crash and his stay in the woods force Brian to accept his manhood and make individual choices. He is faced with a decision that will change his whole life: to grow up and embrace challenges, or die. He accepts the challenge and all the responsibilities that come with it. He experiences the pressures and difficulties of adulthood during his time alone in the woods, and that is what makes him a man. In embracing challenges, Brian also develops self-worth and courage. His newfound self makes him realize that life is worth living, and he will never let death appeal to him again. He realizes that suicide is an act of cowardice, a refusal to accept responsibility and challenges. He musters the courage he never knew he had, and uses it to overcome fears and difficulties. I can learn from him by persevering through challenges in life and solving problems, for that is what makes me stronger.

My favorite part of the story was when Brian found the survival pack. He was finally rewarded for his courage and self-reliance during his stay in the woods; he had come so far, from a mere child to a man who chose to be tough and embrace challenges. My least favorite part was when the search plane missed him and flew away. He lost his will for a while and was desperate, because that was the plane that was searching for him. He even tried to end the pain and suffering by cutting himself with the hatchet. Yes, I would recommend this book to a friend because it shows the importance of self-reliance and optimism. You can only bring changes to enhance your life by making rational choices, relying on yourself, and thinking positively.

Intro to fsync, end_fsync, fdatasync and sync

  • fsync=int

If writing to a file, issue an fsync(2) (or its equivalent) of the dirty data for every number of blocks given. For example, if you give 32 as a parameter, fio will sync the file after every 32 writes issued. If fio is using non-buffered I/O, we may not sync the file. The exception is the sg I/O engine, which synchronizes the disk cache anyway. Defaults to 0, which means fio does not periodically issue and wait for a sync to complete. Also see end_fsync and fsync_on_close.

  • end_fsync=bool

If true, fsync(2) file contents when a write stage has completed. Default: false.

  • fsync_on_close=bool

If true, fio will fsync(2) a dirty file on close. This differs from end_fsync in that it will happen on every file close, not just at the end of the job. Default: false.

  • fdatasync=int

Like fsync but uses fdatasync(2) to only sync data and not metadata blocks. In Windows, DragonFlyBSD or OSX there is no fdatasync(2) so this falls back to using fsync(2). Defaults to 0, which means fio does not periodically issue and wait for a data-only sync to complete.

  • sync=str

Whether, and what type, of synchronous I/O to use for writes. The allowed values are:

  • none - Do not use synchronous IO, the default.
  • 0 - Same as none.
  • sync - Use synchronous file IO. For the majority of I/O engines, this means using O_SYNC.
  • 1 - Same as sync.
  • dsync - Use synchronous data IO. For the majority of I/O engines, this means using O_DSYNC.

Source

Create a 100MB file only

Here we only create a 100MB file, which is what “laying out the IO file” means in the fio context. No actual I/O happens after the file creation, even though we specify I/O-related options such as blocksize=8k.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --fsync=1 --numjobs=1 --direct=1 --group_reporting --create_only=1
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)

Run status group 0 (all jobs):

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

From the strace output, the disk space for the file is allocated with fallocate(2).

$ cat strace.out
[..]
15801 write(1, "Starting 1 process\n", 19) = 19
15801 stat("/mnt/bench1/test.0.0", 0x7ffc477fcc20) = -1 ENOENT (No such file or directory)
15801 write(1, "test: Laying out IO file (1 file"..., 43) = 43
15801 unlink("/mnt/bench1/test.0.0")    = -1 ENOENT (No such file or directory)
15801 open("/mnt/bench1/test.0.0", O_WRONLY|O_CREAT, 0644) = 3
15801 fallocate(3, 0, 0, 104857600)     = 0
15801 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
15801 close(3)
[..]

Create and write 100MB file with fsync=1

Here we write 100MB of data to a file with an 8k blocksize.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --fsync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=15988: Mon Mar 14 22:28:51 2022
  write: IOPS=6174, BW=48.2MiB/s (50.6MB/s)(100MiB/2073msec)
    slat (usec): min=29, max=163, avg=56.07, stdev=12.67
    clat (usec): min=19, max=118, avg=30.21, stdev= 7.39
     lat (usec): min=52, max=197, avg=86.72, stdev=18.43
    clat percentiles (usec):
     |  1.00th=[   21],  5.00th=[   24], 10.00th=[   25], 20.00th=[   26],
     | 30.00th=[   27], 40.00th=[   29], 50.00th=[   29], 60.00th=[   30],
     | 70.00th=[   31], 80.00th=[   32], 90.00th=[   38], 95.00th=[   47],
     | 99.00th=[   61], 99.50th=[   67], 99.90th=[   80], 99.95th=[   84],
     | 99.99th=[  102]
   bw (  KiB/s): min=45700, max=53392, per=99.44%, avg=49121.00, stdev=3324.04, samples=4
   iops        : min= 5712, max= 6674, avg=6140.00, stdev=415.68, samples=4
  lat (usec)   : 20=0.66%, 50=95.25%, 100=4.07%, 250=0.02%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=95, max=7603, avg=235.07, stdev=105.18
    sync percentiles (nsec):
     |  1.00th=[  107],  5.00th=[  165], 10.00th=[  193], 20.00th=[  211],
     | 30.00th=[  211], 40.00th=[  213], 50.00th=[  217], 60.00th=[  221],
     | 70.00th=[  225], 80.00th=[  274], 90.00th=[  338], 95.00th=[  342],
     | 99.00th=[  398], 99.50th=[  426], 99.90th=[  486], 99.95th=[  540],
     | 99.99th=[ 6880]
  cpu          : usr=7.63%, sys=29.92%, ctx=89626, majf=0, minf=13
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,12799 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=48.2MiB/s (50.6MB/s), 48.2MiB/s-48.2MiB/s (50.6MB/s-50.6MB/s), io=100MiB (105MB), run=2073-2073msec

Disk stats (read/write):
  nvme0n1: ios=0/34843, merge=0/11135, ticks=0/400, in_queue=399, util=95.01%

From the strace output, fsync is issued after each 8k block write.

$ cat strace.out
[..]
15988 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
15988 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
15988 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
15988 io_submit(0x7f819d8b1000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
15988 io_getevents(0x7f819d8b1000, 1, 1, [{data=0, obj=0xc88de0, res=8192, res2=0}], NULL) = 1
15988 fsync(3)                          = 0
15988 io_submit(0x7f819d8b1000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=8192}]) = 1
15988 io_getevents(0x7f819d8b1000, 1, 1, [{data=0, obj=0xc88de0, res=8192, res2=0}], NULL) = 1
15988 fsync(3)                          = 0
[..]
15988 io_submit(0x7f819d8b1000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\340?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
15988 io_getevents(0x7f819d8b1000, 1, 1, [{data=0, obj=0xc88de0, res=8192, res2=0}], NULL) = 1
15988 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=158715}, ru_stime={tv_sec=0, tv_usec=620431}, ...}) = 0
15988 close(3)
[..]

Create and write 100MB file with end_fsync=1

Here we write 100MB of data to a file; fsync is issued once after the write stage completes.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --end_fsync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=16648: Mon Mar 14 22:50:22 2022
  write: IOPS=9631, BW=75.2MiB/s (78.9MB/s)(100MiB/1329msec)
    slat (usec): min=42, max=131, avg=61.18, stdev=11.19
    clat (usec): min=29, max=100, avg=39.34, stdev= 6.50
     lat (usec): min=77, max=193, avg=101.11, stdev=10.93
    clat percentiles (nsec):
     |  1.00th=[32128],  5.00th=[32640], 10.00th=[33024], 20.00th=[34048],
     | 30.00th=[35072], 40.00th=[35584], 50.00th=[36608], 60.00th=[38656],
     | 70.00th=[43776], 80.00th=[44800], 90.00th=[46848], 95.00th=[50944],
     | 99.00th=[60672], 99.50th=[62720], 99.90th=[70144], 99.95th=[72192],
     | 99.99th=[80384]
   bw (  KiB/s): min=77168, max=77408, per=100.00%, avg=77288.00, stdev=169.71, samples=2
   iops        : min= 9646, max= 9676, avg=9661.00, stdev=21.21, samples=2
  lat (usec)   : 50=94.53%, 100=5.46%, 250=0.01%
  cpu          : usr=7.91%, sys=42.70%, ctx=51220, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,1 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=75.2MiB/s (78.9MB/s), 75.2MiB/s-75.2MiB/s (78.9MB/s-78.9MB/s), io=100MiB (105MB), run=1329-1329msec

Disk stats (read/write):
  nvme0n1: ios=0/12803, merge=0/5, ticks=0/203, in_queue=203, util=91.40%

From the strace output, fsync is issued at the end of the fio job.

$ cat strace.out
[..]
16648 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
16648 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
16648 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
16648 io_submit(0x7fac95355000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
16648 io_getevents(0x7fac95355000, 1, 1, [{data=0, obj=0xa37de0, res=8192, res2=0}], NULL) = 1
16648 io_submit(0x7fac95355000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=8192}]) = 1
16648 io_getevents(0x7fac95355000, 1, 1, [{data=0, obj=0xa37de0, res=8192, res2=0}], NULL) = 1
[..]
16648 io_submit(0x7fac95355000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\340?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
16648 io_getevents(0x7fac95355000, 1, 1, [{data=0, obj=0xa37de0, res=8192, res2=0}], NULL) = 1
16648 fsync(3)                          = 0
16648 close(3)                          = 0
[..]

Create and write 100MB file with fdatasync=1

Here we write 100MB of data to a file with an 8k blocksize. fdatasync is issued after each block write.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --fdatasync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=23011: Tue Mar 15 02:45:48 2022
  write: IOPS=4857, BW=37.0MiB/s (39.8MB/s)(100MiB/2635msec)
    slat (usec): min=30, max=183, avg=70.02, stdev=12.85
    clat (usec): min=20, max=280, avg=45.46, stdev= 9.98
     lat (usec): min=54, max=348, avg=116.11, stdev=20.49
    clat percentiles (usec):
     |  1.00th=[   25],  5.00th=[   26], 10.00th=[   31], 20.00th=[   41],
     | 30.00th=[   45], 40.00th=[   46], 50.00th=[   48], 60.00th=[   49],
     | 70.00th=[   49], 80.00th=[   50], 90.00th=[   52], 95.00th=[   58],
     | 99.00th=[   75], 99.50th=[   84], 99.90th=[  137], 99.95th=[  143],
     | 99.99th=[  172]
   bw (  KiB/s): min=37168, max=43632, per=100.00%, avg=38934.40, stdev=2648.49, samples=5
   iops        : min= 4646, max= 5454, avg=4866.80, stdev=331.06, samples=5
  lat (usec)   : 50=86.37%, 100=13.41%, 250=0.21%, 500=0.01%
  fsync/fdatasync/sync_file_range:
    sync (nsec): min=107, max=11447, avg=232.89, stdev=147.40
    sync percentiles (nsec):
     |  1.00th=[  127],  5.00th=[  161], 10.00th=[  185], 20.00th=[  211],
     | 30.00th=[  221], 40.00th=[  227], 50.00th=[  231], 60.00th=[  241],
     | 70.00th=[  247], 80.00th=[  258], 90.00th=[  266], 95.00th=[  282],
     | 99.00th=[  334], 99.50th=[  350], 99.90th=[  516], 99.95th=[  620],
     | 99.99th=[ 9408]
  cpu          : usr=6.19%, sys=31.13%, ctx=89613, majf=0, minf=13
  IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,0 short=12799,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=37.0MiB/s (39.8MB/s), 37.0MiB/s-37.0MiB/s (39.8MB/s-39.8MB/s), io=100MiB (105MB), run=2635-2635msec

Disk stats (read/write):
  nvme0n1: ios=0/35262, merge=0/11272, ticks=0/416, in_queue=416, util=96.06%

From the strace output, fdatasync is issued after each 8k block write.

$ cat strace.out
[..]
23011 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_DIRECT, 0600) = 3
23011 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
23011 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
23011 io_submit(0x7f1a71a81000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
23011 io_getevents(0x7f1a71a81000, 1, 1, [{data=0, obj=0x1207de0, res=8192, res2=0}], NULL) = 1
23011 fdatasync(3)
[..]
23011 io_submit(0x7f1a71a81000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0 ?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104841216}]) = 1
23011 io_getevents(0x7f1a71a81000, 1, 1, [{data=0, obj=0x1207de0, res=8192, res2=0}], NULL) = 1
23011 fdatasync(3)                      = 0
23011 io_submit(0x7f1a71a81000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0 ?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
23011 io_getevents(0x7f1a71a81000, 1, 1, [{data=0, obj=0x1207de0, res=8192, res2=0}], NULL) = 1
23011 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=164077}, ru_stime={tv_sec=0, tv_usec=820386}, ...}) = 0
23011 close(3)
[..]

Create and write 100MB file with sync=1

Here we write 100MB of data by doing 8k sequential writes with 1 job. The I/O is synchronous.

$ strace -f -o strace.out fio --name=test --ioengine=libaio --blocksize=8k --readwrite=write --directory=/mnt/bench1 --nrfiles=1 --filesize=100m --sync=1 --numjobs=1 --direct=1 --group_reporting
test: (g=0): rw=write, bs=(R) 8192B-8192B, (W) 8192B-8192B, (T) 8192B-8192B, ioengine=libaio, iodepth=1
fio-3.7
Starting 1 process
test: Laying out IO file (1 file / 100MiB)
Jobs: 1 (f=1)
test: (groupid=0, jobs=1): err= 0: pid=16435: Mon Mar 14 22:43:59 2022
  write: IOPS=9377, BW=73.3MiB/s (76.8MB/s)(100MiB/1365msec)
    slat (usec): min=28, max=146, avg=47.45, stdev= 7.00
    clat (usec): min=33, max=581, avg=57.00, stdev=11.75
     lat (usec): min=79, max=619, avg=104.88, stdev=14.94
    clat percentiles (usec):
     |  1.00th=[   43],  5.00th=[   45], 10.00th=[   48], 20.00th=[   51],
     | 30.00th=[   52], 40.00th=[   53], 50.00th=[   57], 60.00th=[   59],
     | 70.00th=[   60], 80.00th=[   62], 90.00th=[   67], 95.00th=[   76],
     | 99.00th=[   95], 99.50th=[   99], 99.90th=[  113], 99.95th=[  131],
     | 99.99th=[  578]
   bw (  KiB/s): min=70896, max=77461, per=98.88%, avg=74178.50, stdev=4642.16, samples=2
   iops        : min= 8862, max= 9682, avg=9272.00, stdev=579.83, samples=2
  lat (usec)   : 50=18.82%, 100=80.74%, 250=0.42%, 750=0.02%
  cpu          : usr=4.62%, sys=28.52%, ctx=63963, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,12800,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
  WRITE: bw=73.3MiB/s (76.8MB/s), 73.3MiB/s-73.3MiB/s (76.8MB/s-76.8MB/s), io=100MiB (105MB), run=1365-1365msec

Disk stats (read/write):
  nvme0n1: ios=0/32201, merge=0/10294, ticks=0/356, in_queue=356, util=91.92%

From the strace output, the file is opened with the “O_SYNC” flag since we specified “--sync=1” in the fio command. This means all the incoming writes are synchronous.

$ cat strace.out
[..]
16435 open("/mnt/bench1/test.0.0", O_RDWR|O_CREAT|O_SYNC|O_DIRECT, 0600) = 3
16435 fadvise64(3, 0, 104857600, POSIX_FADV_DONTNEED) = 0
16435 fadvise64(3, 0, 104857600, POSIX_FADV_SEQUENTIAL) = 0
16435 io_submit(0x7fe4c1fe3000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=0}]) = 1
16435 io_getevents(0x7fe4c1fe3000, 1, 1, [{data=0, obj=0x1980de0, res=8192, res2=0}], NULL) = 1
16435 io_submit(0x7fe4c1fe3000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0 \0\0\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=8192}]) = 1
16435 io_getevents(0x7fe4c1fe3000, 1, 1, [{data=0, obj=0x1980de0, res=8192, res2=0}], NULL) = 1
[..]
16435 io_submit(0x7fe4c1fe3000, 1, [{aio_lio_opcode=IOCB_CMD_PWRITE, aio_fildes=3, aio_buf="\0\200?\6\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., aio_nbytes=8192, aio_offset=104849408}]) = 1
16435 io_getevents(0x7fe4c1fe3000, 1, 1, [{data=0, obj=0x1980de0, res=8192, res2=0}], NULL) = 1
16435 getrusage(RUSAGE_THREAD, {ru_utime={tv_sec=0, tv_usec=64153}, ru_stime={tv_sec=0, tv_usec=389856}, ...}) = 0
16435 close(3)
[..]

Start from single server

Building a large-scale system is not a one-time effort; it is an iterative process as the user workload increases. The journey can start with a single server and a single user request.

Image

  1. A user accesses the website through a domain name, such as example.com.
  2. The IP address (resolved via DNS) is returned to the browser or the mobile app.
  3. HTTP requests are sent to the web server.
  4. The web server returns the requested resources, such as an HTML page or a JSON response.

Database

With the growth of the user base, we can separate the web server and the database server to allow them to scale independently.

Image

SQL vs. NoSQL

We can choose between relational database and non-relational database.

A relational database is also called a relational database management system (RDBMS) or SQL database. Popular relational databases include MySQL, Oracle Database, Microsoft SQL Server, PostgreSQL, and so on. A relational database stores data in tables and rows. Join operations can be performed across different database tables.

A non-relational database is also called a NoSQL database. Popular ones include MongoDB, CouchDB, Cassandra, HBase, Neo4j, Amazon DynamoDB, and so on. They can be grouped into four categories: key-value stores, graph stores, column stores, and document stores. Generally, join operations are not supported in non-relational databases.

A non-relational database might be the right choice if:

  • very low latency is required
  • data is unstructured or not relational
  • the data needs to be serialized and deserialized (JSON, XML, YAML, etc.)
  • massive amount of data needs to be stored

Vertical scaling vs. horizontal scaling

Vertical scaling, also called “scale up”, means adding more compute power (CPU/memory), disk bandwidth, and network bandwidth to a server. Vertical scaling is simple, but it’s impossible to add unlimited resources to a single server. It doesn’t provide failover or redundancy, so it can become a single point of failure.

Horizontal scaling, also called “scale out”, means adding more servers to the resource pool. It’s suitable for large-scale applications.

Load balancer

With the single-server setup, users won’t be able to access the website if the web server goes offline. Users may also experience slow responses (high latency) or failed connections if too many requests hit the server simultaneously.

A load balancer can help evenly distribute incoming traffic among web servers.

Image

Users connect to the public IP of the load balancer. The web servers sitting behind the load balancer are no longer directly reachable by users. Private IPs are used for communication between the load balancer and the web servers.

After the load balancer and more web servers are added, the availability of the web tier improves. For example, if web server 1 goes offline, the traffic can be routed to other web servers that are still healthy. As the website traffic grows, we can keep adding web servers to handle the incoming load.
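To make the routing idea concrete, here is a minimal round-robin load balancer sketch in Go. The backend addresses are placeholders, and a production setup would normally use a dedicated load balancer (hardware, HAProxy/Nginx, or a cloud LB) rather than hand-rolled code.

package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "sync/atomic"
)

// Private addresses of the web servers behind the load balancer
// (placeholders for illustration only).
var backends = []string{
    "http://10.0.0.11:8080",
    "http://10.0.0.12:8080",
    "http://10.0.0.13:8080",
}

func main() {
    // Build one reverse proxy per backend web server.
    proxies := make([]*httputil.ReverseProxy, len(backends))
    for i, b := range backends {
        u, err := url.Parse(b)
        if err != nil {
            log.Fatal(err)
        }
        proxies[i] = httputil.NewSingleHostReverseProxy(u)
    }

    // Round-robin: each incoming request goes to the next backend,
    // spreading traffic evenly across the web servers.
    var next uint64
    handler := func(w http.ResponseWriter, r *http.Request) {
        i := atomic.AddUint64(&next, 1) % uint64(len(proxies))
        proxies[i].ServeHTTP(w, r)
    }

    // Users connect to the load balancer's public IP; port 80 here.
    log.Fatal(http.ListenAndServe(":80", http.HandlerFunc(handler)))
}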

Database replication

Now that we can scale the web tier as needed, how about the data tier? Database replication is a common technique to address problems like failover and redundancy.

“Database replication can be used on many database management systems (DBMS), usually with a primary/replica relationship between the original and the copies.” Source

A master database generally supports write operations only. A slave database replicates data from the master and supports read operations only.

Image

The master-slave model is not the only method; we just use it to learn how to scale the database tier. We can achieve much better performance by separating the write and read operations between the master and slave databases. Replication also provides data reliability in the case of a natural disaster like an earthquake, and the data remains highly available even if one database goes offline.

  • If the master database goes offline, a slave database will be promoted as the new master, and a new slave database will replace the old one for data replication. Promoting a new master is not a simple process because the data in the slave database may not be up to date. The missing data has to be recovered by some recovery mechanism.
  • If the slave databases go offline, read operations can be directed to the master database temporarily until new slave databases are in place.
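As a minimal illustration of this read/write split, here is a sketch in Go built on database/sql; the Router type, the random replica selection, and the fallback behavior are assumptions of this sketch, not part of any particular database's replication feature.

package dbrouter

import (
    "database/sql"
    "math/rand"
)

// Router sends all writes to the master database and spreads reads
// across the slave (replica) databases, as described above.
type Router struct {
    Master   *sql.DB
    Replicas []*sql.DB
}

// Exec routes write operations (INSERT/UPDATE/DELETE) to the master.
func (r *Router) Exec(query string, args ...interface{}) (sql.Result, error) {
    return r.Master.Exec(query, args...)
}

// Query routes read operations to a randomly chosen replica, falling
// back to the master if no replica is available.
func (r *Router) Query(query string, args ...interface{}) (*sql.Rows, error) {
    if len(r.Replicas) == 0 {
        return r.Master.Query(query, args...)
    }
    return r.Replicas[rand.Intn(len(r.Replicas))].Query(query, args...)
}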

Cache

After we have a scalable design for the web and data tiers, it’s time to improve system performance, such as response time. Adding a cache layer can help serve user requests more quickly.

  1. If the data exists in the cache, read it from the cache.
  2. If the data doesn’t exist in the cache, fetch it from the database, save it to the cache, and then return it to the web server. This is the so-called read-through cache (see the sketch after the considerations below).

Image

Here are some considerations for using a cache system:

  • Consider using a cache when the data is read frequently but modified infrequently
  • Use an expiration policy so stale data is reloaded from the data store
  • Use an eviction policy to make room when the cache is full, such as least-recently-used (LRU) or first-in-first-out (FIFO)
  • Maintain consistency between the data store and the cache (e.g., with a memcache-style solution)
  • Avoid a single point of failure in the cache system
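Putting the read-through flow and the expiration policy together, here is a minimal in-memory sketch in Go. The loader function stands in for the real database query, and the TTL-based expiration is just one possible policy; eviction (LRU/FIFO) and cache-server redundancy are left out to keep the sketch short.

package cache

import (
    "sync"
    "time"
)

// entry holds a cached value and its expiration time.
type entry struct {
    value     []byte
    expiresAt time.Time
}

// ReadThrough caches loader results in memory with TTL-based expiration.
type ReadThrough struct {
    mu     sync.Mutex
    ttl    time.Duration
    data   map[string]entry
    loader func(key string) ([]byte, error)
}

func New(ttl time.Duration, loader func(string) ([]byte, error)) *ReadThrough {
    return &ReadThrough{ttl: ttl, data: make(map[string]entry), loader: loader}
}

// Get returns the cached value if present and not expired; otherwise it
// fetches from the data store, saves the result, and returns it.
func (c *ReadThrough) Get(key string) ([]byte, error) {
    c.mu.Lock()
    e, ok := c.data[key]
    c.mu.Unlock()
    if ok && time.Now().Before(e.expiresAt) {
        return e.value, nil // cache hit
    }
    v, err := c.loader(key) // cache miss: read from the database
    if err != nil {
        return nil, err
    }
    c.mu.Lock()
    c.data[key] = entry{value: v, expiresAt: time.Now().Add(c.ttl)}
    c.mu.Unlock()
    return v, nil
}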

Content Delivery Network(CDN)

A CDN is a network of geographically dispersed servers used to deliver static content. CDN servers cache static content like images, videos, CSS, JavaScript files, etc.

When a user visits the website, the CDN server closest to the user delivers the static content, so the website loads faster.

Image

  1. If the content to be loaded is not in the CDN, fetch it from the origin server and store it on the CDN server.
  2. If the content is already in the CDN, return it directly from the CDN.

Image

Stateless web tier

In order to scale the web tier horizontally, we need to move state (e.g. user session data) out of the web tier. A common practice is to store the session data in persistent storage such as a SQL or NoSQL database. Any web server in the cluster can then access the state data from that store. This is called a stateless web tier.

In the stateless architecture, HTTP requests from users can be sent to any web server. The session data is fetched from a shared data store that is separate from the web servers, as sketched below.
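A minimal sketch of what a stateless handler might look like in Go: the SessionStore interface and the session_id cookie name are assumptions here, standing in for whatever shared store (SQL, NoSQL, Redis, etc.) actually holds the sessions.

package web

import "net/http"

// SessionStore abstracts the shared data store that holds session data
// outside the web servers.
type SessionStore interface {
    Get(sessionID string) (map[string]string, error)
}

// SessionHandler looks the session up in the shared store, so any web
// server behind the load balancer can serve the request.
func SessionHandler(store SessionStore) http.HandlerFunc {
    return func(w http.ResponseWriter, r *http.Request) {
        c, err := r.Cookie("session_id")
        if err != nil {
            http.Error(w, "no session", http.StatusUnauthorized)
            return
        }
        sess, err := store.Get(c.Value)
        if err != nil {
            http.Error(w, "session not found", http.StatusUnauthorized)
            return
        }
        w.Write([]byte("hello " + sess["user"]))
    }
}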

Image

After the session data is moved out of the web servers, auto-scaling the web tier becomes much easier.

Image

Data centers

As website traffic grows internationally, supporting multiple data centers becomes crucial. In the event of a data center outage, traffic can be directed to a healthy data center.

Image

Message queue

To further scale the system, we can decouple the system components so that they can be scaled independently.

A message queue is a durable component, stored in memory, that supports asynchronous communication. It serves as a buffer and distributes asynchronous requests. Typically, an input service, also called a producer/publisher, creates messages and publishes them to the queue. Consumer services (subscribers) connect to the queue and process the messages asynchronously.
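The following toy example in Go uses a buffered channel to stand in for the message queue; a real deployment would use a broker such as Kafka or RabbitMQ, and the worker count here is arbitrary. It only illustrates the producer/consumer decoupling: the producer publishes without waiting, and consumers process messages asynchronously and can be scaled independently.

package main

import (
    "fmt"
    "sync"
)

func main() {
    // The buffered channel stands in for the message queue.
    queue := make(chan string, 100)

    var wg sync.WaitGroup
    // Consumer workers; their number can be scaled independently of the producer.
    for w := 1; w <= 3; w++ {
        wg.Add(1)
        go func(id int) {
            defer wg.Done()
            for msg := range queue {
                fmt.Printf("worker %d processed %s\n", id, msg)
            }
        }(w)
    }

    // Producer publishes messages without waiting for them to be processed.
    for i := 0; i < 10; i++ {
        queue <- fmt.Sprintf("job-%d", i)
    }
    close(queue)
    wg.Wait()
}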

Image

Logging, monitoring, automation

As the system scales to support a very large business, it’s necessary to invest in logging, monitoring, and automation. These tools help troubleshoot issues, provide insight into system health, and improve productivity.

Database scaling

Vertical scaling and horizontal scaling are the two approaches for database scaling.

Vertical scaling, also known as scaling up, means adding more hardware (CPU, memory, disk, network) to the existing server. But it has hardware limits, the cost can be high, and there is a risk of a single point of failure.

Horizontal scaling, also known as sharding, is the practice of adding more database servers. Sharding separates a large database into smaller pieces called shards. A hash function (e.g. user_id % n) determines which shard a given row lives in. In this example, user_id is the sharding key, which is an important factor to consider when implementing a sharding strategy. The goal is to choose a key that distributes the data evenly across shards.
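Here is a small sketch of that modulo-based shard selection in Go; the shard count and user IDs are made up for illustration. Note that with this scheme, changing the number of shards moves most keys, which is exactly the resharding problem listed below.

package main

import "fmt"

// shardFor maps the sharding key (user_id) to one of n shards using the
// user_id % n hash function mentioned above.
func shardFor(userID, n uint64) uint64 {
    return userID % n
}

func main() {
    const shards = 4
    for _, id := range []uint64{101, 102, 103, 104} {
        fmt.Printf("user %d -> shard %d\n", id, shardFor(id, shards))
    }
}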

Database sharding also introduces complexities and new challenges to the system:

  • resharding data
  • hotspot key
  • join operations across database shards

Conclusion

Scaling a system is an iterative process. As user requests increase and the system scales, more fine-tuning and new strategies are needed. In summary, we can scale the system to support millions of users by addressing the following areas.

  • Keep web tier stateless by storing session data out of web servers
  • Have redundancy for each tier
  • Cache data as much as possible to improve response time
  • Build multiple data centers
  • Use CDN to serve static content
  • Shard the data tier if needed
  • Decouple components to increase throughput (e.g. with a message queue)
  • Add tools to improve ease of use

Go does not have a built-in set type, but there is a way to imitate one.

The following example creates a “set” in Go by using a map with empty struct values.

$ cat test.go
package main

import (
    "fmt"
)

// Main function
func main() {
    set := make(map[string]struct{})
    set["cat"] = struct{}{}
    set["dog"] = struct{}{}
    set["rabbit"] = struct{}{}

    if _, ok := set["rabbit"]; ok {
        fmt.Println("rabbit exists")
    }

    delete(set,"rabbit")

    if _, ok := set["rabbit"]; ok {
        fmt.Println("exist")
    } else {
        fmt.Println("rabbit doesn't exist")
    }
}

$ go run test.go
rabbit exists
rabbit doesn't exist

Books

  • The Art of Computer Programming

  • Understanding the Linux Kernel, 3rd Edition

  • Linux Device Drivers, 3rd Edition

  • Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, 1st Edition

  • Building Microservices: Designing Fine-Grained Systems 1st Edition

  • Site Reliability Engineering: How Google Runs Production Systems 1st Edition

  • Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) 1st Edition

  • Big Data: Principles and best practices of scalable realtime data systems

  • NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence

  • Professional NoSQL 1st Edition

  • Kubernetes: Up and Running: Dive into the Future of Infrastructure 2nd Edition

  • Kubernetes in Action 1st Edition

  • Systems Performance, 2nd Edition

  • BPF Performance Tools

  • Understanding Software Dynamics


Specify file system operations in FWD and RD

In the File System Workload Definition (FWD), we can specify a single file system operation to be executed for the workload with the parameter ‘operation=’. The valid operations are mkdir, rmdir, create, delete, open, close, read, write, getattr and setattr. We can also specify ‘fwd=xxx,rdpct=nn’ to allow mixed read and write operations against the same file.

If more than one operation needs to be specified, the parameter ‘fwd=xxx,operations=’ in the Run Definition (RD) can be used to override the single operation in the FWD. For example, ‘operations=mkdir’ or ‘operations=(read,getattr)’ can be specified in the RD. Vdbench will fail the operation if the file structure has not been created before it is accessed.

In the following example, we will run vdbench to make directories, create files, getattr, delete files and remove directories.

Make directories

Prepare the job file to make the directories:

$ cat jobfile/vdb_mkdir.job
hd=default,vdbench=/home/tester/vdbench_test,shell=ssh,user=root
hd=host1,jvms=1,system=192.168.1.2

fsd=fsd1,anchor=/mnt/bench1,depth=5,width=5,files=2,size=4k
fwd=fwd1,fsd=fsd1,host=host1,operation=mkdir
rd=rd1,fwd=fwd1,format=no,fwdrate=max

Run vdbench job to make the directories:

$ ./vdbench -f jobfile/vdb_mkdir.job
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Vdbench distribution: vdbench50407 Tue June 05  9:49:29 MDT 2018
For documentation, see 'vdbench.pdf'.

02:06:05.755 input argument scanned: '-fjobfile/vdb_mkdir.job'
02:06:05.833 Anchor size: anchor=/mnt/bench1: dirs:        3,905; files:        6,250; bytes:    24.414m (25,600,000)
02:06:06.097 Starting slave: ssh 192.168.1.2 -l root /home/tester/vdbench_test/vdbench SlaveJvm -m 192.168.1.3 -n 192.168.1.2-10-220316-02.06.05.716 -l host1-0 -p 5570
02:06:06.527 All slaves are now connected
02:06:08.002 Starting RD=rd1; elapsed=30; fwdrate=max. For loops: None

Mar 16, 2022 ..Interval.. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ...mkdir.... ....open.... ...close....
                            rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp
02:06:09.065            1 3905.0  0.018   3.7 0.41   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0  3905  0.018   0.0  0.000   0.0  0.000
02:06:09.102      avg_2-1    NaN  0.000   NaN  NaN   0.0    NaN  0.000    NaN  0.000   NaN   NaN    NaN       0   NaN  0.000   NaN  0.000   NaN  0.000
02:06:09.102      std_2-1
02:06:09.103      max_2-1
02:06:09.216
02:06:09.216 Miscellaneous statistics:
02:06:09.216 (These statistics do not include activity between the last reported interval and shutdown.)
02:06:09.216 DIRECTORY_CREATES   Directories created:                          3,905      3,905/sec
02:06:09.216
02:06:09.932 Vdbench execution completed successfully. Output directory: /home/tester/vdbench_test/output

$ ls -la /mnt/bench1/vdb.1_1.dir/vdb.2_1.dir/vdb.3_1.dir/vdb.4_1.dir/vdb.5_1.dir/
total 8
drwxr-xr-x 2 root root 4096 Mar 16 02:06 .
drwxr-xr-x 7 root root 4096 Mar 16 02:06 ..

Create files in the existing directories

Prepare the job file to create files in the existing directories:

$ cat jobfile/vdb_create.job
hd=default,vdbench=/home/tester/vdbench_test,shell=ssh,user=root
hd=host1,jvms=1,system=192.168.1.2

fsd=fsd1,anchor=/mnt/bench1,depth=5,width=5,files=2,size=4k
fwd=fwd1,fsd=fsd1,host=host1,operation=create
rd=rd1,fwd=fwd1,format=restart,fwdrate=max

Note that we specify the parameter ‘format=restart’. It will create files that have not been created and will also expand files that have not reached the proper file size.

Run vdbench job to create files:

$ ./vdbench -f jobfile/vdb_create.job
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Vdbench distribution: vdbench50407 Tue June 05  9:49:29 MDT 2018
For documentation, see 'vdbench.pdf'.

02:06:11.387 input argument scanned: '-fjobfile/vdb_create.job'
02:06:11.468 Anchor size: anchor=/mnt/bench1: dirs:        3,905; files:        6,250; bytes:    24.414m (25,600,000)
02:06:11.719 Starting slave: ssh 192.168.1.2 -l root /home/tester/vdbench_test/vdbench SlaveJvm -m 192.168.1.3 -n 192.168.1.2-10-220316-02.06.11.352 -l host1-0 -p 5570
02:06:12.148 All slaves are now connected
02:06:13.002 Starting RD=format_for_rd1
02:06:13.238 host1-0: anchor=/mnt/bench1 mkdir complete.
02:06:13.418 host1-0: anchor=/mnt/bench1 create complete.

Mar 16, 2022 ..Interval.. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ...mkdir.... ...rmdir.... ...create... ....open.... ...close.... ...delete...
                            rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp  rate   resp  rate   resp  rate   resp
02:06:14.063            1 6250.0  0.019  12.7 0.99   0.0    0.0  0.000 6250.0  0.019  0.00 24.41  24.41    4096   0.0  0.000   0.0  0.000  6250  0.219  6250  0.071  6250  0.006   0.0  0.000
02:06:14.103      avg_2-1    NaN  0.000   NaN  NaN   0.0    NaN  0.000    NaN  0.000   NaN   NaN    NaN       0   NaN  0.000   NaN  0.000   NaN  0.000   NaN  0.000   NaN  0.000   NaN  0.000
02:06:14.103      std_2-1
02:06:14.103      max_2-1
02:06:14.217
02:06:14.217 Miscellaneous statistics:
02:06:14.217 (These statistics do not include activity between the last reported interval and shutdown.)
02:06:14.218 FILE_CREATES        Files created:                                6,250      6,250/sec
02:06:14.218 WRITE_OPENS         Files opened for write activity:              6,250      6,250/sec
02:06:14.218 DIR_EXISTS          Directory may not exist (yet):                    7          7/sec
02:06:14.218 FILE_CLOSES         Close requests:                               6,250      6,250/sec
02:06:14.218
02:06:15.001 Starting RD=rd1; elapsed=30; fwdrate=max. For loops: None
02:06:15.344
02:06:15.344 Message from slave host1-0:
02:06:15.345 Anchor: /mnt/bench1
02:06:15.345 Vdbench is trying to create a new file, but all files already exist,
02:06:15.345 and no threads are currently active deleting files
02:06:15.345 FwgThread.canWeGetMoreFiles(): Shutting down threads for operation=create
02:06:15.345

Mar 16, 2022 ..Interval.. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ...mkdir.... ...rmdir.... ...create... ....open.... ...close.... ...delete...
                            rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp  rate   resp  rate   resp  rate   resp
02:06:16.021            1    0.0  0.000   1.8 0.09   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000   0.0  0.000
02:06:16.030      avg_2-1    NaN  0.000   NaN  NaN   0.0    NaN  0.000    NaN  0.000   NaN   NaN    NaN       0   NaN  0.000   NaN  0.000   NaN  0.000   NaN  0.000   NaN  0.000   NaN  0.000
02:06:16.030      std_2-1
02:06:16.030      max_2-1
02:06:16.135 Miscellaneous statistics: All counters are zero
02:06:16.602 Vdbench execution completed successfully. Output directory: /home/tester/vdbench_test/output

$ ls -la /mnt/bench1/vdb.1_1.dir/vdb.2_1.dir/vdb.3_1.dir/vdb.4_1.dir/vdb.5_1.dir/
total 16
drwxr-xr-x 2 root root 4096 Mar 16 02:06 .
drwxr-xr-x 7 root root 4096 Mar 16 02:06 ..
-rw-r--r-- 1 root root 4096 Mar 16 02:06 vdb_f0000.file
-rw-r--r-- 1 root root 4096 Mar 16 02:06 vdb_f0001.file

Getattr

Prepare the job file to getattr:

$ cat jobfile/vdb_getattr.job
hd=default,vdbench=/home/tester/vdbench_test,shell=ssh,user=root
hd=host1,jvms=1,system=192.168.1.2

fsd=fsd1,anchor=/mnt/bench1,depth=5,width=5,files=2,size=4k
fwd=fwd1,fsd=fsd1,host=host1,operation=getattr
rd=rd1,fwd=fwd1,format=no,fwdrate=max

Run vdbench job to getattr:

$ ./vdbench -f jobfile/vdb_getattr.job
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Vdbench distribution: vdbench50407 Tue June 05  9:49:29 MDT 2018
For documentation, see 'vdbench.pdf'.

02:06:18.059 input argument scanned: '-fjobfile/vdb_getattr.job'
02:06:18.145 Anchor size: anchor=/mnt/bench1: dirs:        3,905; files:        6,250; bytes:    24.414m (25,600,000)
02:06:18.408 Starting slave: ssh 192.168.1.2 -l root /home/tester/vdbench_test/vdbench SlaveJvm -m 192.168.1.3 -n 192.168.1.2-10-220316-02.06.18.020 -l host1-0 -p 5570
02:06:18.778 All slaves are now connected
02:06:20.002 Starting RD=rd1; elapsed=30; fwdrate=max. For loops: None

Mar 16, 2022 ..Interval.. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ....open.... ...close.... ..getattr...
                            rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp
02:06:21.062            1  39541  0.015   8.1 0.84   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 39541  0.015
02:06:22.032            2  77608  0.010   5.3 1.47   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 77608  0.010
02:06:23.016            3  83661  0.009   4.5 1.26   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83661  0.009
02:06:24.014            4  83054  0.009   4.4 1.35   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83054  0.009
02:06:25.014            5  83753  0.009   4.3 1.26   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83753  0.009
02:06:26.012            6  83907  0.009   4.3 1.23   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83907  0.009
02:06:27.012            7  83520  0.009   4.3 1.26   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83520  0.009
02:06:28.011            8  82677  0.009   4.4 1.19   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82677  0.009
02:06:29.016            9  83093  0.009   4.6 1.19   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83093  0.009
02:06:30.012           10  84090  0.009   4.5 1.23   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 84090  0.009
02:06:31.011           11  83013  0.009   4.2 1.16   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83013  0.009
02:06:32.013           12  83158  0.010   4.3 1.38   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 83158  0.010
02:06:33.010           13  74121  0.011   4.3 1.19   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 74121  0.011
02:06:34.009           14  77405  0.010   4.4 1.35   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 77405  0.010
02:06:35.011           15  82874  0.010   4.6 1.28   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82874  0.010
02:06:36.011           16  82855  0.010   4.6 1.34   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82855  0.010
02:06:37.010           17  82508  0.010   4.6 1.38   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82508  0.010
02:06:38.007           18  82766  0.010   4.6 1.38   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82766  0.010
02:06:39.009           19  82048  0.010   4.3 1.22   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82048  0.010
02:06:40.010           20  81357  0.010   4.4 1.25   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 81357  0.010
02:06:41.008           21  82723  0.010   4.4 1.28   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82723  0.010
02:06:42.007           22  82702  0.010   4.5 1.50   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82702  0.010
02:06:43.008           23  82793  0.010   4.3 1.41   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82793  0.010
02:06:44.009           24  82465  0.010   4.4 1.28   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82465  0.010
02:06:45.009           25  82695  0.010   4.4 1.32   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82695  0.010
02:06:46.008           26  82119  0.010   4.6 1.47   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82119  0.010
02:06:47.008           27  82297  0.010   4.8 1.72   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82297  0.010
02:06:48.007           28  82508  0.010   4.4 1.35   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82508  0.010
02:06:49.008           29  82545  0.010   4.4 1.41   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82545  0.010
02:06:50.007           30  82641  0.010   4.7 1.44   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82641  0.010
02:06:50.015     avg_2-30  82170  0.010   4.5 1.33   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000 82170  0.010
02:06:50.015     std_2-30 2147.1  0.018                                                                                                    2147  0.018
02:06:50.016     max_2-30  84090  5.854                                                                                                   84090  5.854
02:06:50.234
02:06:50.234 Miscellaneous statistics:
02:06:50.234 (These statistics do not include activity between the last reported interval and shutdown.)
02:06:50.234 GET_ATTR            Getattr requests:                         2,422,766     80,758/sec
02:06:50.234
02:06:51.201 Vdbench execution completed successfully. Output directory: /home/tester/vdbench_test/output

$ ls -la /mnt/bench1/vdb.1_1.dir/vdb.2_1.dir/vdb.3_1.dir/vdb.4_1.dir/vdb.5_1.dir/
total 16
drwxr-xr-x 2 root root 4096 Mar 16 02:06 .
drwxr-xr-x 7 root root 4096 Mar 16 02:06 ..
-rw-r--r-- 1 root root 4096 Mar 16 02:06 vdb_f0000.file
-rw-r--r-- 1 root root 4096 Mar 16 02:06 vdb_f0001.file

Delete files

Prepare the job file to delete files:

$ cat jobfile/vdb_delete.job
hd=default,vdbench=/home/tester/vdbench_test,shell=ssh,user=root
hd=host1,jvms=1,system=192.168.1.2

fsd=fsd1,anchor=/mnt/bench1,depth=5,width=5,files=2,size=4k
fwd=fwd1,fsd=fsd1,host=host1,operation=delete
rd=rd1,fwd=fwd1,format=no,fwdrate=max

Run vdbench job to delete files:

$ ./vdbench -f jobfile/vdb_delete.job
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Vdbench distribution: vdbench50407 Tue June 05  9:49:29 MDT 2018
For documentation, see 'vdbench.pdf'.

02:06:52.656 input argument scanned: '-fjobfile/vdb_delete.job'
02:06:52.757 Anchor size: anchor=/mnt/bench1: dirs:        3,905; files:        6,250; bytes:    24.414m (25,600,000)
02:06:53.060 Starting slave: ssh 192.168.1.2 -l root /home/tester/vdbench_test/vdbench SlaveJvm -m 192.168.1.3 -n 192.168.1.2-10-220316-02.06.52.608 -l host1-0 -p 5570
02:06:53.542 All slaves are now connected
02:06:55.002 Starting RD=rd1; elapsed=30; fwdrate=max. For loops: None
02:06:55.403
02:06:55.403 Message from slave host1-0:
02:06:55.403 Anchor: /mnt/bench1
02:06:55.403 Vdbench is trying to delete a file, but no files are available, and no
02:06:55.404 threads are currently active creating new files.
02:06:55.404 FwgThread.canWeGetMoreFiles(): Shutting down threads for operation=delete
02:06:55.404

Mar 16, 2022 ..Interval.. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ....open.... ...close.... ...delete...
                            rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp
02:06:56.061            1 6250.0  0.044   7.3 0.61   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0   0.0  0.000   0.0  0.000  6250  0.044
02:06:56.096      avg_2-1    NaN  0.000   NaN  NaN   0.0    NaN  0.000    NaN  0.000   NaN   NaN    NaN       0   NaN  0.000   NaN  0.000   NaN  0.000
02:06:56.096      std_2-1
02:06:56.097      max_2-1
02:06:56.210
02:06:56.210 Miscellaneous statistics:
02:06:56.210 (These statistics do not include activity between the last reported interval and shutdown.)
02:06:56.210 FILE_DELETES        Files deleted:                                6,250      6,250/sec
02:06:56.211 FILE_MUST_EXIST     File does not exist (yet):                        1          1/sec
02:06:56.211
02:06:56.949 Vdbench execution completed successfully. Output directory: /home/tester/vdbench_test/output

$ ls -la /mnt/bench1/vdb.1_1.dir/vdb.2_1.dir/vdb.3_1.dir/vdb.4_1.dir/vdb.5_1.dir/
total 8
drwxr-xr-x 2 root root 4096 Mar 16 02:06 .
drwxr-xr-x 7 root root 4096 Mar 16 02:06 ..

Remove directories

Prepare the job file to remove directories:

$ cat jobfile/vdb_rmdir.job
hd=default,vdbench=/home/tester/vdbench_test,shell=ssh,user=root
hd=host1,jvms=1,system=192.168.1.2

fsd=fsd1,anchor=/mnt/bench1,depth=5,width=5,files=2,size=4k
fwd=fwd1,fsd=fsd1,host=host1,operation=rmdir
rd=rd1,fwd=fwd1,format=no,fwdrate=max

Run vdbench job to remove directories:

$ ./vdbench -f jobfile/vdb_rmdir.job
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Vdbench distribution: vdbench50407 Tue June 05  9:49:29 MDT 2018
For documentation, see 'vdbench.pdf'.

02:06:58.406 input argument scanned: '-fjobfile/vdb_rmdir.job'
02:06:58.484 Anchor size: anchor=/mnt/bench1: dirs:        3,905; files:        6,250; bytes:    24.414m (25,600,000)
02:06:58.736 Starting slave: ssh 192.168.1.2 -l root /home/tester/vdbench_test/vdbench SlaveJvm -m 192.168.1.3 -n 192.168.1.2-10-220316-02.06.58.369 -l host1-0 -p 5570
02:06:59.158 All slaves are now connected
02:07:00.002 Starting RD=rd1; elapsed=30; fwdrate=max. For loops: None

Mar 16, 2022 ..Interval.. .ReqstdOps... ...cpu%...  read ....read..... ....write.... ..mb/sec... mb/sec .xfer.. ...rmdir.... ....open.... ...close....
                            rate   resp total  sys   pct   rate   resp   rate   resp  read write  total    size  rate   resp  rate   resp  rate   resp
02:07:01.051            1 3905.0  0.039   5.4 0.65   0.0    0.0  0.000    0.0  0.000  0.00  0.00   0.00       0  3905  0.039   0.0  0.000   0.0  0.000
02:07:01.089      avg_2-1    NaN  0.000   NaN  NaN   0.0    NaN  0.000    NaN  0.000   NaN   NaN    NaN       0   NaN  0.000   NaN  0.000   NaN  0.000
02:07:01.089      std_2-1
02:07:01.090      max_2-1
02:07:01.204
02:07:01.205 Miscellaneous statistics:
02:07:01.205 (These statistics do not include activity between the last reported interval and shutdown.)
02:07:01.205 DIRECTORY_DELETES   Directories deleted:                          3,905      3,905/sec
02:07:01.205
02:07:01.550 Vdbench execution completed successfully. Output directory: /home/tester/vdbench_test/output

$ ls -la /mnt/bench1/vdb.1_1.dir/vdb.2_1.dir/vdb.3_1.dir/vdb.4_1.dir/vdb.5_1.dir/
ls: cannot access /mnt/bench1/vdb.1_1.dir/vdb.2_1.dir/vdb.3_1.dir/vdb.4_1.dir/vdb.5_1.dir/: No such file or directory


The /proc/diskstats file displays the I/O statistics of block devices.

Here we have a system with one disk, sda, used by the Linux operating system, and two other disks, sdb and sdc, used for other purposes.

$ cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)

$ uname -r
3.10.0-1160.11.1.el7.x86_64

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   128G  0 disk
├─sda1   8:1    0   3.7G  0 part /boot
└─sda2   8:2    0 124.3G  0 part /
sdb      8:16   0    60G  0 disk
sdc      8:32   0    60G  0 disk

The following is the content of the /proc/diskstats file.

$ cat /proc/diskstats
   8       0 sda 20076 3 798556 75864 23573766 4732398 253002144 24887356 0 9814563 24947775
   8       1 sda1 143 0 22434 681 44 21 21664 588 0 397 1269
   8       2 sda2 19905 3 774042 75034 23573722 4732377 252980480 24886768 0 9814404 24946360
   8      32 sdc 88 0 4160 4 0 0 0 0 0 4 4
   8      16 sdb 88 0 4160 2 0 0 0 0 0 2 2

Taking the sda line as an example, each column has the following meaning.

  • 8 - major number
  • 0 - minor number
  • sda - device name
  • 20076 - reads completed successfully
  • 3 - reads merged
  • 798556 - sectors read
  • 75864 - time spent reading (ms)
  • 23573766 - writes completed
  • 4732398 - writes merged
  • 253002144 - sectors written
  • 24887356 - time spent writing (ms)
  • 0 - I/Os currently in progress
  • 9814563 - time spent doing I/Os (ms)
  • 24947775 - weighted time spent doing I/Os (ms)

You may see more columns in /proc/diskstats on newer kernels: 4.18+ adds discard statistics and 5.5+ adds flush statistics. For a detailed explanation, please refer to the kernel documentation.
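As a quick worked example (the awk one-liner below is only a sketch): field 6 is sectors read and field 10 is sectors written, and /proc/diskstats always counts sectors in 512-byte units, so dividing by 2048 gives MiB.

$ awk '$3 == "sda" { printf "%s: %.1f MiB read, %.1f MiB written\n", $3, $6/2048, $10/2048 }' /proc/diskstats

For the sda line shown above this works out to roughly 390 MiB read and about 121 GiB written.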

Introduction

dstat is a versatile tool for generating system resource statistics. It can replace vmstat, iostat and ifstat, overcoming some of their limitations and adding some extra features.

Dstat lets you view all of your system resources in real time; for example, you can compare disk usage with interrupts from your IDE controller, or compare network bandwidth directly with disk throughput over the same interval.

Usage examples

Display statistics of major OS components:

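The original screenshot is not preserved; the simplest way to get this view is to run dstat with no arguments, or with -a, which selects the default -cdngy set (CPU, disk, network, paging and system stats):

root ~$ dstat -a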

Relate disk-throughput with network-usage (eth0), total CPU-usage and system counters:

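This caption follows an example from the dstat man page, which uses a command along these lines:

root ~$ dstat -dnyc -N eth0 -C total -f 5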

Check dstat’s behaviour and the system impact of dstat:

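Again following the man page example (the screenshot itself is not preserved):

root ~$ dstat -taf --debug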

Use the time plugin together with cpu, net, disk, system, load, proc and top_cpu plugins:

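Per the dstat man page, this combination can be written in short or long form:

root ~$ dstat -tcndylp --top-cpu
root ~$ dstat --time --cpu --net --disk --sys --load --proc --top-cpu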

Display stats of the process using the most CPU or the most memory:

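The exact command is not preserved; the top-cpu and top-mem plugins provide these columns individually, for example:

root ~$ dstat --top-cpu
root ~$ dstat --top-mem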

Display stats of the processes using the most CPU and the most memory:

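Presumably both plugins combined in a single run (again, the original command is not preserved):

root ~$ dstat --top-cpu --top-mem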

Relate cpu stats with interrupts per device:

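The command behind the screenshot is not preserved; one way to get a comparable view is to combine the cpu, sys and int columns and expand them per CPU and per interrupt with -f:

root ~$ dstat -tcyif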

Display the information that would be displayed by the vmstat tool:

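dstat has a --vmstat shortcut intended to mimic vmstat output; if it is not available in your version, combining the paging, disk, swap, memory, process and system columns by hand gives a similar view:

root ~$ dstat --vmstat
root ~$ dstat -gdsmpy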

Display the list of all plugins:

root ~$ dstat --list
internal:
    aio, cpu, cpu24, disk, disk24, disk24old, epoch, fs, int, int24, io, ipc, load, lock, mem, net, page, page24, proc, raw, socket, swap, swapold, sys, tcp, time, udp, unix, vm
/usr/share/dstat:
    battery, battery-remain, cpufreq, dbus, disk-tps, disk-util, dstat, dstat-cpu, dstat-ctxt, dstat-mem, fan, freespace, gpfs, gpfs-ops, helloworld, innodb-buffer, innodb-io, innodb-ops, lustre, memcache-hits, mysql-io, mysql-keys, mysql5-cmds, mysql5-conn,
    mysql5-io, mysql5-keys, net-packets, nfs3, nfs3-ops, nfsd3, nfsd3-ops, ntp, postfix, power, proc-count, qmail, rpc, rpcd, sendmail, snooze, squid, test, thermal, top-bio, top-bio-adv, top-childwait, top-cpu, top-cpu-adv, top-cputime, top-cputime-avg, top-int,
    top-io, top-io-adv, top-latency, top-latency-avg, top-mem, top-oom, utmp, vm-memctl, vmk-hba, vmk-int, vmk-nic, vz-cpu, vz-io, vz-ubc, wifi
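
Any of these plugins, internal or external, can be enabled by prefixing its name with a double dash, just like --top-cpu above. As an illustrative combination (the plugin choice here is ours, taken from the list above):

root ~$ dstat --time --disk-util --disk-tps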

Force float values to be printed:

root ~$ dstat --float 2
You did not select any stats, using -cdngy by default.
----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw
4.5 1.8  94 0.1   0 0.1| 251B 76.1k|   0     0 |   0     0 |4619  8021
2.3 1.1  96 0.1   0 0.1|   0  54.0k|15.9k 7215B|   0     0 |3497  6381
5.3 2.4  92 0.1   0 0.1|   0   112k|18.1k 17.4k|   0     0 |5323  9184
9.4 2.3  88 0.1   0 0.3|   0   116k|24.5k  193k|   0     0 |5370  9091
2.3 1.3  96 0.1   0   0|   0  82.0k|18.0k 10.6k|   0     0 |3448  6127 ^C

Display time, CPU, memory, network, disk, system load and process stats with a 2-second delay between 5 updates, and save the output to a CSV file:

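The screenshot is not preserved; one invocation matching the description (the report.csv filename is just an example) is:

root ~$ dstat -tcmndlp --output report.csv 2 5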

Learn more about how dstat is implemented:

Check out the source code here. Credit to Dag Wieers.
