
When copying data from USB devices in Linux (Debian / Ubuntu), you may have noticed that reading data from the disk the first time takes a while, and reading the second time takes only a few seconds.

For example:

[email protected] /media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 2m58.620s
user 0m7.032s
sys 0m1.429s

[email protected] /media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 0m3.467s
user 0m3.285s
sys 0m0.181s

Here the first read took 2 minutes 58 seconds, while the second took only about 3 seconds. This is because the kernel keeps the data it reads in memory, so the second read is served from RAM instead of from the device. In cases where the disk may change between reads, the cached data may no longer match the current state of the disk, so anything computed from it (such as a hash) can be stale.
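If you want to see the cache filling up, one option is to compare the figures in /proc/meminfo before and after a read. Here is a minimal sketch. The function name cache_kb and its optional file argument are made up for illustration, and it assumes the usual meminfo layout where Buffers and Cached are reported in kB. Reads from a raw block device like /dev/sdf1 are typically counted under Buffers, so the sketch adds the two together.

```shell
#!/bin/sh
# cache_kb: print Buffers + Cached (in kB) from a meminfo-style file.
# The name and the optional path argument are illustrative, not a standard tool.
# With no argument it reads the live values from /proc/meminfo.
cache_kb() {
    awk '$1 == "Buffers:" || $1 == "Cached:" { total += $2 }
         END { print total + 0 }' "${1:-/proc/meminfo}"
}

# Example: see how much the cache grows during a large read.
# before=$(cache_kb)
# sudo md5sum /dev/sdf1
# after=$(cache_kb)
# echo "cache grew by $((after - before)) kB"
```

If the number jumps by roughly the size of the partition after the first read, the fast second read is coming from memory.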

When looking for how to disable the read cache, I found a lot of information about disabling the write cache, but not much about disabling the read cache.

To disable the write cache (if the device supports it) for as long as the device stays plugged in:

sudo hdparm -W 0 /dev/[device]

But this does not solve our read cache problem. Unfortunately, I could not find a way to completely disable the read cache, but we can clear it.
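For a single read, one thing that may help is direct I/O, which asks the kernel to skip the cache for that read only. GNU dd exposes this as iflag=direct. The following is a minimal sketch, assuming GNU coreutils and a device that accepts direct reads; hash_direct is a made-up name for illustration.

```shell
#!/bin/sh
# hash_direct: md5 a file or block device while asking the kernel to bypass
# the cache for this read (O_DIRECT, requested through GNU dd's iflag=direct).
# The function name is illustrative; it assumes GNU dd and md5sum are installed.
hash_direct() {
    dd if="$1" iflag=direct bs=1M status=none | md5sum | cut -d' ' -f1
}

# Example (needs read access to the device, e.g. run from a root shell):
# hash_direct /dev/sdf1
```

It does not remove anything that is already in the cache, so clearing the cache is still worth knowing how to do.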

First, determine the path to echo with

which echo

Then we want to tell the kernel to drop caches. To do this, we need to echo a value to /proc/sys/vm/drop_caches.
To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
To free reclaimable slab objects (includes dentries and inodes):
echo 2 > /proc/sys/vm/drop_caches
To free slab objects and pagecache:
echo 3 > /proc/sys/vm/drop_caches

So our echo command to clear all caches should look like:

sudo sh -c "/bin/echo 3 > /proc/sys/vm/drop_caches"

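Note: Running sudo echo 3 > /proc/sys/vm/drop_caches will most likely fail with a permission error, because the redirection is performed by your own shell and not by the command run under sudo. The workaround is to wrap the whole command, redirection included, in a shell started by sudo. Make sure you use the full path to echo on your system.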
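Another common way around the redirection problem is to pipe into sudo tee, so that the process opening the file is the one running as root. The sketch below wraps that in a function and runs sync first, since dirty pages cannot be dropped until they have been written out. The name drop_caches and the optional second argument (a target path, handy for trying it out on an ordinary file) are assumptions for illustration.

```shell
#!/bin/sh
# drop_caches: write 1, 2 or 3 to the kernel's drop_caches file.
# The function name and the optional target argument are illustrative;
# the target defaults to the real kernel interface.
drop_caches() {
    level="${1:-3}"
    target="${2:-/proc/sys/vm/drop_caches}"
    case "$level" in
        1|2|3) ;;
        *) echo "usage: drop_caches [1|2|3] [target]" >&2; return 2 ;;
    esac
    sync    # write out dirty pages first; dirty data is not freeable
    if [ -w "$target" ]; then
        # Already allowed to write (for example in a root shell).
        echo "$level" > "$target"
    else
        # tee runs under sudo, so root is the one opening the file.
        echo "$level" | sudo tee "$target" > /dev/null
    fi
}

# Example: drop pagecache and slab objects.
# drop_caches 3
```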

[email protected] /media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 3m18.294s
user 0m6.389s
sys 0m1.390s

[email protected] /media/joshua/ucdntfs $ sudo sh -c "/bin/echo 3 > /proc/sys/vm/drop_caches"

[email protected] /media/joshua/ucdntfs $ time sudo md5sum /dev/sdf1
3a698f0c3155e494274e5e7829f4d246 /dev/sdf1

real 3m18.344s
user 0m6.545s
sys 0m1.438s

If you use it a lot, like me, you might want to make an alias:

alias clearusbcache="sudo sh -c '/bin/echo 3 > /proc/sys/vm/drop_caches'"

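If you want to clear the cache in the background while running experiments, you can try this script (run it as root, since it writes to drop_caches directly):

while true; do
    # Drop pagecache and slab objects once per second
    /bin/echo 3 > /proc/sys/vm/drop_caches
    sleep 1
done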