
Maintaining and Identifying Duplicates in library

whatdoineed2do edited this page May 5, 2022 · 1 revision

Maintaining a growing digital music library can be painful, with duplicate tracks being one of the main issues. When you use OwnTone to serve your music, you have the benefit that it is backed by a database you can query.

Identifying potential duplicates based on title and artist

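On a Raspberry Pi installation, the following sqlite3 session writes every track that shares its title and artist with another track to /tmp/dupls.txt, highest bitrate first: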

$ sqlite3 /var/cache/owntone/songs3.db
.load /usr/lib/arm-linux-gnueabihf/owntone/owntone-sqlext.so
.output /tmp/dupls.txt
    WITH cte AS
    (
        SELECT title,artist,count(*) c
          FROM files
      GROUP BY title,artist
        HAVING c > 1
    )
    SELECT t.title,t.path,t.bitrate,t.artist,t.album
      FROM files t
INNER JOIN cte
        ON cte.title = t.title AND
           cte.artist = t.artist
  ORDER BY t.title,t.artist,t.bitrate DESC;
.output

Examine the output file, /tmp/dupls.txt, and determine which tracks/files can be deleted.
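
The review of the duplicate list can also be scripted. The following is a minimal Python sketch, not part of OwnTone, that runs the same title/artist grouping and reports every copy except the highest-bitrate one as a removal candidate. The function name removal_candidates is hypothetical and "keep the highest bitrate" is an assumed policy. Against the real songs3.db the connection may also need the owntone-sqlext.so extension loaded, as in the sqlite3 session above.

```python
import sqlite3
from collections import defaultdict

# Same grouping as the sqlite3 session: tracks sharing title and artist,
# ordered so the highest-bitrate copy of each group comes first.
DUPLICATE_QUERY = """
WITH cte AS (
    SELECT title, artist, count(*) AS c
      FROM files
  GROUP BY title, artist
    HAVING c > 1
)
SELECT t.title, t.artist, t.path, t.bitrate
  FROM files t
  JOIN cte ON cte.title = t.title AND cte.artist = t.artist
 ORDER BY t.title, t.artist, t.bitrate DESC
"""


def removal_candidates(conn):
    """Return the paths of all duplicates except the best copy per group.

    Assumption: the best copy is the one with the highest bitrate.
    """
    groups = defaultdict(list)
    for title, artist, path, _bitrate in conn.execute(DUPLICATE_QUERY):
        groups[(title, artist)].append(path)
    # Rows arrive sorted by bitrate descending, so index 0 is the copy to keep.
    return [path for paths in groups.values() for path in paths[1:]]
```

Calling removal_candidates(sqlite3.connect("/var/cache/owntone/songs3.db")) only returns a list; nothing is deleted, so the result can still be checked by hand first.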

Consistent naming of files based on meta

exiftool can be used to rename files and directories based on metadata (i.e. artist, album, title):

$ exiftool -r -ext mp3 -ext flac '-Directory<$Artist/$Album' /export/music
$ exiftool -r -ext mp3 -ext flac '-filename<$Track - $Title.%le' /export/music

Be aware that metadata containing "/" will result in nested directories; for example, the artist "AC/DC" becomes a directory AC containing a subdirectory DC.
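
The "/" problem is easy to reproduce outside exiftool. The sketch below is plain Python and does not call exiftool; safe_component and target_dir are hypothetical helper names, and replacing "/" with "-" is just one possible convention. It shows how a tag value with a slash splits into two path components and how substituting the separator keeps it as one.

```python
from pathlib import PurePosixPath


def safe_component(value, replacement="-"):
    """Replace path separators so one tag value maps to one path component."""
    return value.replace("/", replacement)


def target_dir(root, artist, album):
    """Build root/Artist/Album with separators in the tags neutralised."""
    return PurePosixPath(root) / safe_component(artist) / safe_component(album)


# Unsanitised, the artist tag is split into two directory levels.
naive = PurePosixPath("/export/music", "AC/DC", "Back in Black")
fixed = target_dir("/export/music", "AC/DC", "Back in Black")
print(naive.parts)
print(fixed)
```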

Clearing duplicates

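Files that are identical can be easily removed with fdupes. Note that -d combined with -N deletes without prompting and keeps only the first file of each duplicate set. The second command then removes any directories left empty.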

$ fdupes -rdNsI /export/music
$ find /export/music -type d -empty -delete
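
Because the fdupes invocation deletes without asking, it can be worth previewing which files are byte-identical first. The following is a rough Python sketch of such a dry run. It groups files by a SHA-256 digest of their contents, which is a stand-in technique and not a description of how fdupes itself compares files; duplicate_groups is a hypothetical name.

```python
import hashlib
import os
from collections import defaultdict


def duplicate_groups(root):
    """Return lists of paths under root whose contents hash identically."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            digest = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large audio files are not
                # loaded into memory all at once.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
    # Only digests seen more than once represent duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]
```

For example, duplicate_groups("/export/music") returns the groups without touching any file, so the list can be compared against expectations before running the delete.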

Finding Tracks with Malformed Titles etc

This can happen if non-Latin metadata is encoded in the wrong charset.

# only catches metadata with junk characters at the start of the title
$ sqlite3 /var/cache/owntone/songs3.db
     SELECT path,album,artist
       FROM files
      WHERE UNICODE(title) > 128 AND
            UNICODE(title) < 4000
   GROUP BY songalbumid;
...
/export/music/foo/bar/ηM¤£¥i¤(¦X°Ûª©).mp3|bar|foo
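
The range test works because SQLite's UNICODE() returns the code point of the first character of the string. The Python sketch below mirrors that check with ord(); looks_mangled is a hypothetical name, and the Big5-read-as-Latin-1 case is only an assumed example of a wrong charset. It also shows a limitation: legitimate titles that start with an accented Latin letter fall in the same range and are reported too.

```python
def looks_mangled(title):
    """Mirror of: UNICODE(title) > 128 AND UNICODE(title) < 4000."""
    if not title:
        return False
    return 128 < ord(title[0]) < 4000


# Assumed scenario: a Chinese title stored as Big5 bytes but read as Latin-1.
correct = "中文"
mangled = correct.encode("big5").decode("latin-1")

print(looks_mangled(mangled))   # first character lands between 128 and 4000
print(looks_mangled(correct))   # CJK code points are well above 4000
print(looks_mangled("Édith"))   # false positive: valid accented Latin title
```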