vm.dirty_bytes Pop!OS customization trashes BTRFS performance

**Distribution (run `cat /etc/os-release`):**
 
~~~bash
cat /etc/os-release
NAME="Pop!_OS"
VERSION="20.04 LTS"
ID=pop
ID_LIKE="ubuntu debian"
PRETTY_NAME="Pop!_OS 20.04 LTS"
VERSION_ID="20.04"
HOME_URL="https://pop.system76.com"
SUPPORT_URL="https://support.system76.com"
BUG_REPORT_URL="https://github.com/pop-os/pop/issues"
PRIVACY_POLICY_URL="https://system76.com/privacy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
LOGO=distributor-logo-pop-os

uname -a
Linux oryx 5.11.0-7612-generic #13~1617215757~20.04~97a8d1a-Ubuntu SMP Thu Apr 1 21:15:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
~~~
 
**Related Application and/or Package Version (run `apt policy $PACKAGE NAME`):**

~~~bash
apt policy pop-default-settings 
pop-default-settings:
  Installed: 4.0.6~1611854075~20.04~6a2277e
  Candidate: 4.0.6~1611854075~20.04~6a2277e
  Version table:
 *** 4.0.6~1611854075~20.04~6a2277e 1001
       1001 http://ppa.launchpad.net/system76/pop/ubuntu focal/main amd64 Packages
       1001 http://ppa.launchpad.net/system76/pop/ubuntu focal/main i386 Packages
        100 /var/lib/dpkg/status
~~~

**Issue/Bug Description:**

Commit 6a2277e02efae1d3df642ae1cf26383e6e8a81f6 reports:

> # fix: Set reasonable size for dirty bytes parameters
>
>The kernel default is to buffer up to 10% of system RAM before flushing writes to the disk, which is insane. By setting a reasonable number of bytes for the `dirty_bytes` parameter, we can avoid sending the system into OOM during a large file transfer.
>
> https://lwn.net/Articles/572911/
>
> ~~~diff
> diff --git a/etc/sysctl.d/10-pop-default-settings.conf b/etc/sysctl.d/10-pop-default-settings.conf
> index 987317f..0430a48 100644
> --- a/etc/sysctl.d/10-pop-default-settings.conf
> +++ b/etc/sysctl.d/10-pop-default-settings.conf
> @@ -1 +1,3 @@
>  vm.swappiness = 10
> +vm.dirty_bytes = 16777216
> +vm.dirty_background_bytes = 4194304
> ~~~

Unfortunately this fix has the unintended side effect of completely trashing the performance of COW filesystems like BTRFS for regular use as rootfs/home on fast SSDs!

No penalty is observed when writing large files to a BTRFS partition, but the change has a severe impact on operations that do many small writes, such as touching metadata during a `btrfs receive` operation or simply writing a lot of small files (e.g. untarring a big archive with a complex directory structure).
Such operations can take up to 20 times the wall-clock time of the same operation run with this change commented out (which reverts to the kernel defaults `vm.dirty_ratio = 20` and `vm.dirty_background_ratio = 10`).
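
To put the two configurations side by side, here is a small back-of-the-envelope sketch. The byte values come from the commit quoted above and the percentages from the kernel defaults. The 16 GiB RAM figure is a hypothetical example, and the kernel applies the ratios to available (free plus reclaimable) memory rather than total RAM, so the ratio figures are upper bounds only.

```bash
#!/usr/bin/env bash
# Thresholds set by pop-default-settings (from the commit quoted above).
dirty_bytes=16777216
dirty_background_bytes=4194304

# Assumption for illustration only: a machine with 16 GiB of RAM.
ram_bytes=$((16 * 1024 * 1024 * 1024))

# Convert a byte count to whole MiB (integer division, rounds down).
to_mib() { echo $(( $1 / 1024 / 1024 )); }

# Upper bound for a ratio-based threshold: percent of total RAM, in bytes.
ratio_bytes() { echo $(( ram_bytes * $1 / 100 )); }

echo "Pop!_OS dirty_bytes:               $(to_mib "$dirty_bytes") MiB"
echo "Pop!_OS dirty_background_bytes:    $(to_mib "$dirty_background_bytes") MiB"
echo "default dirty_ratio=20:            up to $(to_mib "$(ratio_bytes 20)") MiB"
echo "default dirty_background_ratio=10: up to $(to_mib "$(ratio_bytes 10)") MiB"
```

Under these assumptions the byte-based limits allow roughly 200 times less dirty data than the ratio-based defaults (16 MiB versus up to about 3276 MiB), which would be consistent with writeback being forced far more often during bursts of small writes.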

When using BTRFS for rootfs and home this is even worse. Operations as simple as `apt update` (or PackageKit running it in the background), `apt upgrade`, or just regular Firefox/Chrome use (which can write frequently to the local on-disk cache) can cause freezes lasting from a few seconds to a few minutes, during which the CPU is stuck in iowait and every process waits for the kernel-triggered I/O thrashing to finish.
Operations where the user is intentionally doing a lot of writes are worse still: compiling big projects, cloning a moderate or big git repo locally, or using `ccache` becomes unbearable!

My suggestion is to revert this change, or to find a different compromise that manages to fix the occasional OOM problems when writing big files to slow block devices without making it impossible to do many small writes to fast devices.
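
Until the packaged default changes, one possible local workaround is a drop-in that restores the ratio-based defaults. The sketch below is an example under stated assumptions, not something shipped by Pop!_OS: the file name `99-local-dirty-ratio.conf` is hypothetical, and it relies on files in `/etc/sysctl.d/` being applied in lexical order so that a `99-` file is read after `10-pop-default-settings.conf`. Writing `vm.dirty_ratio` makes the kernel stop using `vm.dirty_bytes` (which then reads back as 0), and likewise for the background pair. The script writes to a scratch directory by default so the result can be inspected before anything touches `/etc`.

```bash
#!/usr/bin/env bash
# Hypothetical drop-in name; the "99-" prefix sorts after 10-pop-default-settings.conf.
conf_name=99-local-dirty-ratio.conf

# Write to a scratch directory unless CONF_DIR is set explicitly.
conf_dir=${CONF_DIR:-$(mktemp -d)}

cat > "$conf_dir/$conf_name" <<'EOF'
# Restore the kernel's ratio-based writeback defaults.
vm.dirty_ratio = 20
vm.dirty_background_ratio = 10
EOF

echo "wrote $conf_dir/$conf_name"
```

After reviewing it, the file can be copied to `/etc/sysctl.d/` and loaded with `sudo sysctl --system`.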

The comments on the LWN article linked in the original commit are quite enlightening: they anticipated similar problems on COW filesystems from this approach, and suggest that it might be difficult to strike a good balance without actual kernel changes that would make these sysctl knobs superfluous.


**Steps to reproduce (if you know):**

1. create a BTRFS partition on a fast SSD
2. mount it (I am using options `defaults,noatime,compress=zstd` but they are not particularly relevant, you can test with or without)
3. have separate terminals running `iotop` and `htop` to examine CPU and I/O utilization; alternatively, you can use `sysstat` to collect the data and visualize it afterwards
4. `time (tar -xpf some_large_and_complex_archive.tar --acls --xattrs -C /path/to/mountpoint ; sync )`
5. unmount the BTRFS partition
6. `sudo sysctl -w vm.dirty_ratio=20; sudo sysctl -w vm.dirty_background_ratio=10`
7. redo 1-4
8. compare the time spent on the tar extraction in the two cases
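
Step 4 can be wrapped in a small helper so that both runs are timed the same way. This is a minimal sketch: `timed_extract` is a hypothetical name, it uses one-second resolution from `date +%s` (coarse, but enough for a difference of the size described above), and the archive and mount point paths in the usage line are placeholders.

```bash
#!/usr/bin/env bash
# timed_extract ARCHIVE DEST [extra tar options...]
# Extracts ARCHIVE into DEST, flushes dirty pages with sync, and prints the
# wall-clock time for both together.
timed_extract() {
    archive=$1
    dest=$2
    shift 2
    start=$(date +%s)
    tar -xpf "$archive" -C "$dest" "$@" && sync
    status=$?
    end=$(date +%s)
    echo "elapsed: $((end - start)) s"
    return "$status"
}
```

For example, `timed_extract some_large_and_complex_archive.tar /path/to/mountpoint --acls --xattrs` can be run once with the Pop!OS settings and once after step 6. The thresholds in effect can be checked before each run with `sysctl vm.dirty_bytes vm.dirty_ratio`.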

**Expected behavior:**

Pop!OS on a BTRFS root filesystem should be usable, and its performance should not be crippled to avoid rare corner cases when writing large files to slow devices.

**Other Notes:**

The sample `.tar` I used to debug the performance issues I was seeing, which finally led me to isolate commit 6a2277e02efae1d3df642ae1cf26383e6e8a81f6 as the root cause, was a backup of my old rootfs partition. It doesn't need to be huge: anything that contains a lot of files, with a lot of associated metadata, will work.
In fact, the smaller the ratio of total archived data size to the number of files and amount of metadata, the more visible the difference should be.
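
If no suitable backup is at hand, a synthetic archive with many tiny files should exercise the same path. The sketch below is an illustration, not the archive used for this report: the `make_small_file_archive` name, the default directory and file counts, and the output path in the usage line are all arbitrary assumptions.

```bash
#!/usr/bin/env bash
# make_small_file_archive OUT_TAR [DIRS] [FILES_PER_DIR]
# Builds a tar archive of DIRS directories, each holding FILES_PER_DIR
# files of a few bytes, so metadata dominates the data volume.
make_small_file_archive() {
    out=$1
    dirs=${2:-100}
    files=${3:-100}
    work=$(mktemp -d) || return 1
    d=1
    while [ "$d" -le "$dirs" ]; do
        mkdir -p "$work/tree/dir$d"
        f=1
        while [ "$f" -le "$files" ]; do
            echo "$d-$f" > "$work/tree/dir$d/file$f"
            f=$((f + 1))
        done
        d=$((d + 1))
    done
    tar -cf "$out" -C "$work" tree
    status=$?
    rm -rf "$work"
    return "$status"
}
```

For instance, `make_small_file_archive /tmp/many_small_files.tar 200 500` would pack 100,000 tiny files into one archive.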
