I have a workload similar to the following:

while True:
    data = get_data_from_network()
    filename = sha1(data)
    write_to_file(filename, data, len(data))

Occasionally I read back from the file, but that's not very common. Importantly, I get a lot of these network requests: it's not uncommon for me to write a gigabyte of data out to the disk this way. So for the most part I'm effectively just streaming large volumes of data to the disk. There is an article from Raymond Chen in which he advises a customer not to use FILE_FLAG_NO_BUFFERING because, as Raymond puts it:

If the application reads back from the file, the read can be satisfied from the disk cache, avoiding the physical I/O entirely

But I'm not sure if this applies to me, because depending on the size of the cache, there's a pretty good chance that by the time I go to read that data again, it's already been pushed out by some other data.

I can bypass the cache with FILE_FLAG_NO_BUFFERING when I call CreateFile(), but before I blindly do this, I'm wondering how I can investigate its impact from a performance point of view. I can just time my application, sure, but I'd like to go deeper.
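For context, switching to FILE_FLAG_NO_BUFFERING is not a drop-in change: unbuffered transfers must have sizes and file offsets that are multiples of the volume sector size, and the buffer address must be suitably aligned as well. The following is a rough Python sketch of the size rounding only. The 4096-byte sector size and the helper names `padded_length` and `pad_for_unbuffered_write` are assumptions for illustration; the real sector size has to be queried per volume, and the buffer alignment and the restoring of the true file length afterwards are not handled here.

```python
# Assumption: 4096 bytes is a common sector size, but it is NOT guaranteed.
# The real value must be queried for the target volume.
ASSUMED_SECTOR_SIZE = 4096


def padded_length(size: int, sector_size: int = ASSUMED_SECTOR_SIZE) -> int:
    """Round size up to the next multiple of sector_size."""
    if size < 0 or sector_size <= 0:
        raise ValueError("size must be >= 0 and sector_size must be > 0")
    # Ceiling division without floating point.
    return -(-size // sector_size) * sector_size


def pad_for_unbuffered_write(data: bytes,
                             sector_size: int = ASSUMED_SECTOR_SIZE) -> bytes:
    """Return data zero-padded to a whole number of sectors.

    The caller would still need to cut the file back to len(data)
    after writing; this hypothetical helper does not do that.
    """
    return data + b"\0" * (padded_length(len(data), sector_size) - len(data))


if __name__ == "__main__":
    payload = b"x" * 5000
    print(len(payload), "->", len(pad_for_unbuffered_write(payload)))
```

So every write in the loop above would either have to be padded like this or accumulated into sector-sized chunks, which is part of the cost to weigh against any cache benefit.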

For starters, how big even is the OS cache? Is it per-process, per-file, or global? Is the size configurable? Can I query its size programmatically via an API? Is there a way for me to investigate whether it's being thrashed by my workload? Is there a way to run my program and then determine how many disk reads were served from the memory cache as opposed to from the physical media?

  • The disk cache is good for more than just satisfying reads. It also lets the write return back to your program immediately so you can do more work (like more network-data-getting). Commented Oct 10, 2023 at 22:05
  • The hardware has a buffer that does the same thing, no? That's what FILE_FLAG_WRITE_THROUGH is for (to tell the hardware to flush its buffer as well). I'm considering this option because I've observed in practice (not on Windows, admittedly) that in some cases I'm definitely thrashing the cache, which can dramatically reduce performance unless I use O_DIRECT, the Linux equivalent. Commented Oct 10, 2023 at 22:19
  • You can use Performance Counters to test these, for example \Storage Spaces Write Cache\Read Bypass %, \Storage Spaces Write Cache\Cache Size etc. Commented Oct 11, 2023 at 2:59
  • @ZacharyTurner: Well no, the hardware write cache doesn't let the write return immediately. Early yes, but nowhere near as quickly as the OS write cache. Also, the OS one is orders of magnitude larger than the disk controller's cache, and the OS buffer allows the transfer to the disk controller to be made more efficient (e.g. native command queuing). Commented Oct 11, 2023 at 15:52
  • @YangXiaoPo-MSFT thanks, if you post that as a top-level response I'll accept it as an answer. Commented Oct 11, 2023 at 18:18

2 Answers

1

You could use the Windows Performance Toolkit, which is part of the Windows SDK, to record and analyze ETW data. Recording is easy:

wpr -start CPU -start DiskIO -start FileIO
rem Run your use case now, but record for no more than a few minutes, because the oldest data in the ring buffer is overwritten.
wpr -stop c:\temp\IOTrace.etl

Then you can analyze the trace in WPA (Windows Performance Analyzer). For your case the most important graphs are File I/O and Disk Usage.

The Disk Usage graph shows the actual (uncached) accesses to the physical disk, while File I/O shows all file operations regardless of whether they were served from the cache. If you flush the cache, you will later see high disk I/O from reading data that could previously be served from the cache. Windows keeps the contents of files that have been read in the Standby list, which is basically the otherwise free memory. If applications allocate all physical memory, there is no memory left for the disk cache either. You can see the size of the Standby list in Task Manager by opening the Memory tab and hovering over the third region.

To see which files are actually cached, you can use RAMMap from Sysinternals, which shows the files stored in the Standby list and the other OS-managed lists.
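The same figure can probably be read programmatically as well. The sketch below is one possible approach, not something from the original answer: it calls `GetPerformanceInfo` from psapi.dll through `ctypes` and multiplies the `SystemCache` field (documented as a page count covering the standby list plus the system working set) by the page size. The helper name `system_cache_bytes` is made up for illustration, and on non-Windows platforms it simply returns `None`.

```python
import ctypes
import sys


class PERFORMANCE_INFORMATION(ctypes.Structure):
    """Mirror of the Win32 PERFORMANCE_INFORMATION structure."""
    _fields_ = [
        ("cb", ctypes.c_uint32),
        ("CommitTotal", ctypes.c_size_t),
        ("CommitLimit", ctypes.c_size_t),
        ("CommitPeak", ctypes.c_size_t),
        ("PhysicalTotal", ctypes.c_size_t),
        ("PhysicalAvailable", ctypes.c_size_t),
        ("SystemCache", ctypes.c_size_t),   # in pages
        ("KernelTotal", ctypes.c_size_t),
        ("KernelPaged", ctypes.c_size_t),
        ("KernelNonpaged", ctypes.c_size_t),
        ("PageSize", ctypes.c_size_t),      # in bytes
        ("HandleCount", ctypes.c_uint32),
        ("ProcessCount", ctypes.c_uint32),
        ("ThreadCount", ctypes.c_uint32),
    ]


def system_cache_bytes():
    """Return the system cache size in bytes, or None when not on Windows."""
    if sys.platform != "win32":
        return None
    info = PERFORMANCE_INFORMATION()
    info.cb = ctypes.sizeof(info)
    psapi = ctypes.WinDLL("psapi", use_last_error=True)
    psapi.GetPerformanceInfo.argtypes = [
        ctypes.POINTER(PERFORMANCE_INFORMATION), ctypes.c_uint32]
    psapi.GetPerformanceInfo.restype = ctypes.c_int
    if not psapi.GetPerformanceInfo(ctypes.byref(info), info.cb):
        raise ctypes.WinError(ctypes.get_last_error())
    return info.SystemCache * info.PageSize


if __name__ == "__main__":
    size = system_cache_bytes()
    print("not on Windows" if size is None else f"{size / 2**20:.0f} MiB")
```

Sampling this value before, during, and after a run gives a coarse view of how the cache grows and shrinks, but it is system-wide and says nothing about which files occupy it; RAMMap remains the tool for that.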

A more detailed explanation about the ETW view can be found at https://aloiskraus.wordpress.com/2016/10/09/how-buffered-io-can-ruin-performance/
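Because File I/O counts every read and Disk Usage counts only the reads that reached the physical disk, the two totals can be combined into a rough cache hit estimate. The sketch below uses made-up byte counts and a hypothetical helper name, `cache_hit_ratio`; in practice you would take the numbers from the WPA tables, filtered to your process and files. Treat the result as an approximation, since read-ahead, paging, and other processes can make disk traffic differ from what your file reads alone would cause.

```python
def cache_hit_ratio(file_read_bytes: int, disk_read_bytes: int) -> float:
    """Estimate the fraction of file reads that were served from the cache.

    file_read_bytes: total bytes read at the File I/O level (cached or not).
    disk_read_bytes: bytes that were actually read from the physical disk.
    """
    if file_read_bytes <= 0:
        raise ValueError("file_read_bytes must be positive")
    # Disk reads can exceed file reads (e.g. read-ahead); clamp so the
    # estimate stays within [0, 1].
    from_disk = min(max(disk_read_bytes, 0), file_read_bytes)
    return 1.0 - from_disk / file_read_bytes


if __name__ == "__main__":
    # Hypothetical numbers: 1 GB read by the application, 250 MB hit the disk.
    ratio = cache_hit_ratio(file_read_bytes=1_000_000_000,
                            disk_read_bytes=250_000_000)
    print(f"{ratio:.0%}")  # prints 75%
```

A ratio near zero on the read-back path would suggest that the data was already evicted by the time it was needed, which is the thrashing scenario the question worries about.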

0

As I said in my comment, you can use Performance Counters to collect disk cache statistics, such as the counters under \Storage Spaces Write Cache\. Note, however, that the documentation warns:

Windows Performance Counters are optimized for administrative/diagnostic data discovery and collection. They are not appropriate for high-frequency data collection or for application profiling since they are not designed to be collected more than once per second.
...
For profiling, you might collect ETW logs with system profiling data using tracelog.exe with -critsec, -dpcisr, -eflag, or -ProfileSource options, or you might use Hardware Counter Profiling.

The System Providers are not completely documented, and what documentation exists is not very good.

C:\Windows\System32\perfmon.exe
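Besides the perfmon GUI, counters can also be sampled from a script. The following is a minimal sketch that assumes the `typeperf` command-line tool is available on the PATH and that its output is the usual PDH CSV (a header row followed by timestamp/value rows). The counter path is the one mentioned in the comment and may not exist on machines without Storage Spaces; the helper names are made up for illustration. The default one-second interval respects the warning quoted above.

```python
import csv
import io
import subprocess

# Counter taken from the comment above; it may be absent on your system.
COUNTER = r"\Storage Spaces Write Cache\Read Bypass %"


def typeperf_command(counter: str, samples: int = 10, interval_s: int = 1):
    """Build a typeperf command line: -si is the interval, -sc the sample count."""
    return ["typeperf", counter, "-si", str(interval_s), "-sc", str(samples)]


def parse_typeperf_csv(text: str):
    """Extract the numeric samples from typeperf's CSV output.

    The header row, status messages, and empty samples are skipped
    because their second column is missing or not a number.
    """
    values = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        try:
            values.append(float(row[1]))
        except ValueError:
            continue
    return values


def sample_counter(counter: str = COUNTER, samples: int = 10):
    """Run typeperf (Windows only) and return the collected samples."""
    result = subprocess.run(typeperf_command(counter, samples),
                            capture_output=True, text=True, check=True)
    return parse_typeperf_csv(result.stdout)


if __name__ == "__main__":
    print(sample_counter(samples=5))
```

Running `sample_counter` alongside the workload, once with buffered writes and once with FILE_FLAG_NO_BUFFERING, gives two series that can be compared directly.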
