RomVault, Retool, fixDat & Digital Preservation

💡 Tip

A TLDR is available as Rom Update TLDR.

Foreword

I’ve been using RetroArch for longer than I can remember, and ROM management can be a PITA at times.

During my adolescence, I somehow found a Chinese ROM site called OldmanEmu, and the nightmare still haunts me sometimes: I knew nothing about programming besides bare HTML and CSS.1 Macromedia Flash (it had not yet been acquired by Adobe when I started using it) was beginning to sink, I would not learn Visual Basic until a few years later, and I was using adware-laden clones of popular emulators (with Chinese support at least) that could not load zipped ROMs, so I had to unzip every ROM I downloaded.

To make matters worse, the crappy unarchiver I used could not detect non-ASCII filenames,2 which meant I had to manually extract thousands of ROMs one by one, change each garbled name back into a normal filename, and merge everything into one folder, as each zip came with a pointless directory.

For two or three nights I did this unnecessary yet tedious shit and, unfortunately, by the time I finished, I no longer had the passion to play retro games.

Of course things have become easier and easier these days. Most of Gen Z can’t even imagine that installing an MS-DOS game from 3.5-inch floppy disks could be challenging. I’ve cursed the developers of StarForce (the notorious DRM) a thousand times.

Every now and then I’m amazed by the enormous number of options RetroArch offers. A few years ago I discovered No-Intro & Redump along with playlists; in 2022 I discovered RetroAchievements and NetPlay. And I still wonder how MTL or TAS in RetroArch works (since it never worked for me).

Enough talk, let me get to the point.

RomVault

I assume you already know what No-Intro & Redump are and have a local romset. I wish I had known about RomVault earlier, so I wouldn’t have had to download and manually compare ROMs over and over again every time I updated the romset to match a newer DAT version. I hope you do, so I’ll just leave a few links in case you don’t:

Rsync

I also rediscovered good old rsync recently, and fortunately there is a ROM hosting site that provides an anonymous rsync server.3 Let’s say the server is rsync://rsync.example.com/files; you can then sync missing ROMs to your local copy with these commands:

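# The romset directory name is the same on the server and locally
ROMSET="Sega - Mega Drive - Genesis"
cd path/to/RetroArch
# -a archive, -v verbose, -P keep partial files and show progress;
# --ignore-existing skips files already present, --max-size skips files over 200 MiB
rsync -avP --ignore-existing --append --update --max-size="200m" "rsync://rsync.example.com/files/No-Intro/${ROMSET}/" "roms/No-Intro/${ROMSET}/"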

The romset update workflow:

  • Download the newer DAT files
  • Click the buttons in the sidebar of RomVault
  • Run rsync against a remote server to get the missing ROMs. If you want to delete redundant ROMs while syncing, add the --delete parameter to the rsync command (relax if you don’t: RomVault will identify them anyway)
  • Open RomVault’s ToSort directory and check the ROMs that ended up there
  • Update the playlist in RetroArch
  • Done
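
Since --delete removes local files, it is probably worth previewing what it would do first. A minimal sketch, assuming rsync’s standard --dry-run flag; the function name preview_delete is made up for illustration and is not part of the workflow above:

```shell
#!/bin/bash
# Hypothetical helper: preview what `rsync --delete` would do without changing anything.
# Usage: preview_delete SRC/ DST/
preview_delete() {
    local src="$1" dst="$2"
    # --dry-run only reports the planned actions; with -v, files that would
    # be removed from DST are listed as "deleting <name>".
    rsync -av --dry-run --delete "$src" "$dst"
}

# Example call (the server URL is the same placeholder as above):
# preview_delete "rsync://rsync.example.com/files/No-Intro/${ROMSET}/" "roms/No-Intro/${ROMSET}/"
```

If the "deleting" lines look sane, rerun the real command with --delete and without --dry-run.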

Retool

So I somehow discovered Retool the next day, which is said to be a better filter tool for Redump and No-Intro DAT files. I’m lazy, so I’ll just leave another set of links:

And here is a more robust, low-effort fixDat (update) workflow:

  1. Select a platform with missing ROMs in RomVault, right-click it and choose Save fix DATs
  2. Use jc and jq to extract the standard No-Intro/Redump names of the missing ROMs and format them as a file list
  3. Use rsync to fetch only the files on that list

Steps 2 and 3 can be replaced by my script fixDat.sh:

#!/bin/bash
# Prerequisites: jc, jq & rsync

# Go to the directory that holds the fixDAT saved by RomVault
cd "$(dirname path/to/fixDat_blahblah.dat)" || exit
# Assumes exactly one fixDat_*.dat is present
mv fixDat_*.dat fixDat.dat
# RomVault has a cutoff description bug in fixDATs
#jc --xml < fixDat.dat | jq -r '.datafile.game[] | .description | . + ".zip"' > fixDat.txt
# So use @name instead, but this could be wrong if you use local names in Retool
# jq -r prints raw strings, so no sed is needed to strip the double quotes
jc --xml < fixDat.dat | jq -r '.datafile.game[] | .["@name"] | . + ".zip"' > fixDat.txt
cp fixDat.txt path/to/RetroArch/fixDat.txt
# The header description is the romset name prefixed with "FixDat_"
ROMSET=$(jc --xml < fixDat.dat | jq -r '.datafile.header.description | sub("FixDat_"; "")')

cd path/to/RetroArch || exit
# The include rules must come before the catch-all exclude: the first matching rule wins
rsync -avP --ignore-existing --append --update --max-size="200m" --include-from="fixDat.txt" --exclude="*" "rsync://rsync.example.com/files/No-Intro/${ROMSET}/" "roms/No-Intro/${ROMSET}/"
rm fixDat.txt

Some boring theory you probably don’t want to see:

I noticed that rsync has an --exclude-from param, so it’s natural to have the opposite, --include-from. Judging from “rsync using include-from to sync only specific files or directories”, order matters: rsync checks its filter rules in order and the first match wins. You can’t put --exclude="*" before --include-from, as this would exclude all files.
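
To make the ordering point concrete, here is a toy model in bash. It is only a sketch, not rsync itself: decide is a made-up function, the rules are written as "+ PATTERN" / "- PATTERN", and shell globbing stands in for rsync’s own (richer) pattern syntax.

```shell
#!/bin/bash
# Simplified model of rsync's "first matching rule wins" filter logic.
# Usage: decide NAME RULE...   where RULE is "+ PATTERN" or "- PATTERN".
# Prints "include" or "exclude" for NAME.
decide() {
    local name="$1"; shift
    local rule action pattern
    for rule in "$@"; do
        action="${rule:0:1}"
        pattern="${rule:2}"
        # Unquoted right-hand side => shell glob match (an approximation only).
        if [[ "$name" == $pattern ]]; then
            [[ "$action" == "+" ]] && echo include || echo exclude
            return
        fi
    done
    echo include   # a name that matches no rule is transferred
}

good=("+ Ice Age II (Taiwan) (En) (Unl).zip" "- *")   # --include-from, then --exclude="*"
bad=("- *" "+ Ice Age II (Taiwan) (En) (Unl).zip")    # --exclude="*" first

decide "Ice Age II (Taiwan) (En) (Unl).zip" "${good[@]}"   # include
decide "Some Other Game (USA).zip" "${good[@]}"            # exclude
decide "Ice Age II (Taiwan) (En) (Unl).zip" "${bad[@]}"    # exclude
```

With the catch-all exclude first, the wanted file never reaches its include rule, which is why the order in the script matters.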

Also, the bare command jc --xml < fixDat.dat | jq '.datafile.game[] | .description' only generates a list of quoted ROM names, so I have to append .zip to each name and remove the double quotes from the jq output (which looks a bit weird IMO, but jq prints JSON strings unless you pass -r). It’s unnecessary to prefix the names with **/, so I removed that (it would still work if you added it for fun).

Test file:

"Binary Monster 2 (Taiwan) (En) (Unl).zip"
"**/Digimon Amethyst (USA) (Unl).zip"
Digital Monster 2001 (Taiwan) (En) (Unl).zip
**/Ice Age II (Taiwan) (En) (Unl).zip
"King of Fighters R2 (Taiwan) (Unl)"
"Loppi Puzzle Magazine - Kangaeru Puzzle Dai-2-gou (Japan) (Rev 1) (SGB Enhanced, GB Compatible) (NP)"

Test result:

$ rsync -avP --ignore-existing --append --update --max-size="200m" --include-from="fixDat.txt" --exclude="*" "rsync://rsync.example.com/files/No-Intro/${ROMSET}/" "roms/No-Intro/${ROMSET}/"
Welcome to example rsync endpoint!
Visit us at https://example.com.

receiving incremental file list
./
Digital Monster 2001 (Taiwan) (En) (Unl).zip
        185,326 100%  108.57kB/s    0:00:01 (xfr#1, to-chk=1/3)
Ice Age II (Taiwan) (En) (Unl).zip
        794,700 100%  427.12kB/s    0:00:01 (xfr#2, to-chk=0/3)

sent 1,010 bytes  received 980,487 bytes  67,689.45 bytes/sec
total size is 980,026  speedup is 1.00
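
In the test above, only the two unquoted entries ending in .zip were transferred, with or without the **/ prefix. If you ever edit the list by hand, a cleanup pass along these lines might help. It is just a sketch: normalize_rom_list is a hypothetical name, and it assumes every ROM on the server is a .zip file.

```shell
#!/bin/bash
# Hypothetical helper: clean up a name list before feeding it to --include-from.
# Reads lines on stdin and writes the cleaned lines to stdout:
#   1. drop empty lines
#   2. strip one leading and one trailing double quote
#   3. strip a leading "**/"
#   4. append ".zip" when the line does not already end with it
normalize_rom_list() {
    sed -e '/^$/d' \
        -e 's/^"//' -e 's/"$//' \
        -e 's|^\*\*/||' \
        -e '/\.zip$/!s/$/.zip/'
}

printf '%s\n' \
    '"**/Digimon Amethyst (USA) (Unl).zip"' \
    '"King of Fighters R2 (Taiwan) (Unl)"' | normalize_rom_list
# Digimon Amethyst (USA) (Unl).zip
# King of Fighters R2 (Taiwan) (Unl).zip
```

Names containing rsync wildcard characters such as [ or * may need extra escaping, which this sketch does not handle.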

Digital Preservation

With the emergence of LLMs, it’s getting easier to pollute cyberspace with AIGC. Every week I check new VNs on DLsite or Steam with my VNDB-Calendar. This morning I saw an AI-generated poster for UnionPay, and it even targets senior citizens.4 Tencent uses AI posters in its promotions, and even the mascot of the CCTV New Year’s Gala 2024 had an AI controversy. Man, what can I say? The mascots of the 2008 Beijing Olympics were generally well received, the mascot of the 2010 Shanghai EXPO somehow resembles Gumby but is at least stylish, and even the hilarious mascot of the 2025 Osaka EXPO is, uh, at least iconic.

I wish I could get back to the point, but it seems I’ve spent too much time on this, so this is the end, abruptly.


  1. JavaScript was still a new thing IIRC; things like AJAX or jQuery didn’t even exist. ↩︎

  2. This issue still exists today. Last time I tried, unzip could not detect Shift-JIS encoding without a BSD patch or whatever. The best CLI option is unar (aka The Unarchiver). BTW, it did not support musl last time I checked, as GNUstep does not support musl, and this is the only critical reason I have not gone fully musl-libc. ↩︎

  3. I won’t name which, but you can easily find it by searching for ROM rsync. ↩︎

  4. I’m not against the involvement of AI; I criticize this because the poster is low quality. TBH I’m super interested in curating cyber model figures and was probably one of the earliest to explore VITS, roughly two years before AI-generated music went viral. ↩︎

Vinfall's Geekademy

Sine īrā et studiō