Avoid unnecessary writes when modifying non-media fields #6208

weiqianwang123 · 2025-12-08T02:17:14Z

Summary

Some beets commands such as:

beet modify -a onplayer=true queen

update only custom fields that are not mapped to media tags.
However, the current implementation still performs MediaFile.save(), causing:

unnecessary file writes

updated mtimes

20–30 second delays on large libraries

slow iPod sync / alternative device workflows

Root Cause

Item.write() filters _media_fields, but it always writes even if:

none of the media-tag fields changed

only flexible/custom database fields changed

Fix

Before writing, compute whether any media-tag-backed field changed.
If no media tag differences exist, skip the write entirely.

This preserves existing behavior for:

imports

replaygain

zero / scrub

metadata updates

Testing

A 0.5s sine-wave mp3 was generated via ffmpeg:

ffmpeg -f lavfi -i sine=frequency=440:duration=0.5 tiny.mp3
beet import tiny.mp3

Then:

Should NOT write to file

beet modify -a onplayer=true tiny

Should write to file

beet modify title="new title" tiny

Screenshots attached.

Performance improvement

Before: 25 seconds
After: <1 second
(repeated 200× batch modify)

DCO

All commits signed-off.

Related Issue: #5668

This patch prevents Item.write() from performing a mediafile.save() when no writable media tag fields have changed. Without this change, commands such as 'beet modify -a onplayer=true' trigger unnecessary file writes, updating mtimes and causing significant slowdowns. Fix verified with a tiny mp3 test case. Signed-off-by: Qianwei Wang <qweiw@umich.edu> Signed-off-by: weiqianwang123 <1416740298@qq.com>

github-actions · 2025-12-08T02:17:29Z

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.

sourcery-ai

Hey there - I've reviewed your changes - here's some feedback:

Loading original = Item.from_path(self.path) unconditionally in try_sync adds extra I/O even when write is False; consider only constructing original when write is True and short‑circuiting earlier if not.
Using Item.from_path(self.path) to detect changes may throw if the file is missing or unreadable, whereas try_write previously handled its own errors; consider wrapping this in similar error handling or falling back to writing when comparison fails.
Instead of reconstructing the original item from disk for _media_tags_changed, consider comparing against the pre‑modify database state (e.g., via self._db.get_item(self.id)) to avoid extra file reads and keep the change detection purely metadata‑based.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Loading `original = Item.from_path(self.path)` unconditionally in `try_sync` adds extra I/O even when `write` is False; consider only constructing `original` when `write` is True and short‑circuiting earlier if not.
- Using `Item.from_path(self.path)` to detect changes may throw if the file is missing or unreadable, whereas `try_write` previously handled its own errors; consider wrapping this in similar error handling or falling back to writing when comparison fails.
- Instead of reconstructing the original item from disk for `_media_tags_changed`, consider comparing against the pre‑modify database state (e.g., via `self._db.get_item(self.id)`) to avoid extra file reads and keep the change detection purely metadata‑based.

## Individual Comments

### Comment 1
<location> `beets/library/models.py:1016` </location>
<code_context>
         (conditionally).
         """
-        if write:
+        original = Item.from_path(self.path)
+
+        # only write tags if media tags changed
</code_context>

<issue_to_address>
**issue:** Handle failures when loading `original` via `Item.from_path` to avoid breaking `try_sync` on IO/tagging errors.

Because `try_sync` now calls `Item.from_path(self.path)`, it can fail due to transient or unrelated IO/tagging issues (missing file, permissions, corrupt metadata) and abort the entire sync. Consider wrapping `Item.from_path` in a try/except and, on failure, either skip the change-detection optimization (always write) or log and continue, so `try_sync` degrades gracefully instead of raising unexpectedly.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2025-12-08T02:18:09Z

beets/library/models.py

        (conditionally).
        """
-        if write:
+        original = Item.from_path(self.path)


issue: Handle failures when loading original via Item.from_path to avoid breaking try_sync on IO/tagging errors.

Because try_sync now calls Item.from_path(self.path), it can fail due to transient or unrelated IO/tagging issues (missing file, permissions, corrupt metadata) and abort the entire sync. Consider wrapping Item.from_path in a try/except and, on failure, either skip the change-detection optimization (always write) or log and continue, so try_sync degrades gracefully instead of raising unexpectedly.

Copilot

Pull request overview

This PR aims to optimize the modify command by avoiding unnecessary file writes when only non-media fields (custom/flexible database fields) are changed. The goal is to reduce I/O operations, prevent unnecessary mtime updates, and improve performance significantly (from 25 seconds to <1 second for 200 batch modifications).

Key changes:

Added _media_tags_changed() method to compare media tag fields between two Item instances
Modified try_sync() to conditionally skip file writes when only database-only fields are modified
Introduced logic to detect whether media tag fields have actually changed before writing to disk

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-08T02:21:10Z

beets/library/models.py

        (conditionally).
        """
-        if write:
+        original = Item.from_path(self.path)


Item.from_path() can raise a ReadError exception if the file is unreadable, missing, or corrupted. This exception is not caught, which would cause try_sync() to fail unexpectedly.

This is particularly problematic because:

The method is called try_sync(), implying it should handle errors gracefully

File corruption or deletion could occur between modification and sync

The original code path didn't read from disk at this point

Consider wrapping this in a try-except block, or using a different approach that doesn't require re-reading the file.

Suggested change

original = Item.from_path(self.path)

try:

original = Item.from_path(self.path)

except ReadError as exc:

log.error("{}", exc)

return False

Copilot · 2025-12-08T02:21:11Z

beets/library/models.py

+        original = Item.from_path(self.path)
+
+        # only write tags if media tags changed
+        if write and self._media_tags_changed(original, self):
            self.try_write()


The new optimization logic in try_sync() lacks test coverage. Consider adding tests to verify:

Skipping writes when only flexible/custom fields are modified (e.g., onplayer=true)

Performing writes when media tag fields are modified (e.g., title="new title")

Handling edge cases like missing or corrupted files when checking for changes

Performance improvement is achieved (optional: benchmark test)

The existing WriteTest class in test/test_library.py would be a good place to add these tests.

Copilot · 2025-12-08T02:21:11Z

beets/library/models.py

            return False

+    def _media_tags_changed(self, original, modified):
+        for f in self._media_tag_fields:   # this is defined in the class already


The _media_tags_changed method lacks documentation. Consider adding a docstring to explain:

What the method does

The parameters original and modified (expected to be Item instances)

Return value (boolean indicating if any media tag fields differ)

Additionally, the inline comment "this is defined in the class already" is unnecessary and can be removed.

Suggested change

for f in self._media_tag_fields: # this is defined in the class already

"""

Determine if any media tag fields differ between two Item instances.

Parameters:

original (Item): The original Item instance.

modified (Item): The modified Item instance.

Returns:

bool: True if any media tag fields differ, False otherwise.

"""

for f in self._media_tag_fields:

Copilot · 2025-12-08T02:21:11Z

beets/library/models.py

        (conditionally).
        """
-        if write:
+        original = Item.from_path(self.path)


Calling Item.from_path() here defeats the performance optimization. from_path() creates a new Item and calls read(), which opens and parses the entire media file using MediaFile(syspath(read_path)). This is the expensive I/O operation the PR aims to avoid.

Instead, consider tracking which fields have been modified since the last read/write. For example:

Track dirty fields in a set when fields are modified via __setitem__

Check if any dirty fields intersect with _media_tag_fields

Only write if media tag fields have been modified

Alternatively, compare self against the in-memory state before any modifications were made, rather than re-reading from disk.

weiqianwang123 requested a review from a team as a code owner December 8, 2025 02:17

Copilot AI review requested due to automatic review settings December 8, 2025 02:17

Copilot started reviewing on behalf of weiqianwang123 December 8, 2025 02:17 View session

sourcery-ai bot reviewed Dec 8, 2025

View reviewed changes

Copilot AI reviewed Dec 8, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Avoid unnecessary writes when modifying non-media fields #6208

Avoid unnecessary writes when modifying non-media fields #6208

Uh oh!

weiqianwang123 commented Dec 8, 2025

Uh oh!

github-actions bot commented Dec 8, 2025

Uh oh!

sourcery-ai bot left a comment

Uh oh!

sourcery-ai bot Dec 8, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 8, 2025

Uh oh!

Copilot AI Dec 8, 2025

Uh oh!

Copilot AI Dec 8, 2025

Uh oh!

Copilot AI Dec 8, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

-        for f in self._media_tag_fields:   # this is defined in the class already
+        """
+        Determine if any media tag fields differ between two Item instances.
+        Parameters:
+            original (Item): The original Item instance.
+            modified (Item): The modified Item instance.
+        Returns:
+            bool: True if any media tag fields differ, False otherwise.
+        """
+        for f in self._media_tag_fields:

Avoid unnecessary writes when modifying non-media fields #6208

Are you sure you want to change the base?

Avoid unnecessary writes when modifying non-media fields #6208

Uh oh!

Conversation

weiqianwang123 commented Dec 8, 2025

Should NOT write to file

Should write to file

Uh oh!

github-actions bot commented Dec 8, 2025

Uh oh!

sourcery-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

sourcery-ai bot Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 8, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant