I recently had to shrink around 50GB of MP3 audio recordings that were sitting in a nested folder structure on a web server. Having experimented to find more appropriate LAME encoder settings for spoken word content I needed to transcode the files whilst keeping the existing ID3 tags intact. FFmpeg can do this nicely using libmp3lame, whereas LAME by itself cannot. Armed with my own compile of FFmpeg, I created a drag & drop batch script to recursively work through a folder structure transcoding MP3 and WAV files and writing out the resulting MP3 files in the same folder structure with an amended top level folder name. It will accept multiple files or folders being dragged and dropped. You could adapt this script to whatever task you’re using FFmpeg for.
::MP3 transcoding to more sensible default quality settings for spoken word
::32kHz mono VBR @ approx. 64Kbps
::parse multiple command lines (so multiple targets can be dragged and dropped at once)
if "%~1"=="" goto :continue
call :mainloop %1
::main loop we run for each command line arg of the script
if "%~1"=="" (
echo Drag and drop files or folders onto this script.
echo "%1" | find /I ".mp3" && (
echo "%1" | find /I ".wav" && (
echo Processing folder %1
::we need to find the last folder level in the folderpath
::batch scripting is horrible so we need to use a hacky way to get the last token
::iterate parsing the tokens and trimming the string until it's empty, while counting how many loops
for /F "tokens=1* delims=\" %%a in ( "%temp_string%" ) do (
set /A count+=1
for /F "tokens=%count% delims=\" %%i in ( "%~1" ) do set folder=%%i
::append the top level folder name with "-optimized" so we create a new folder tree
for /R "%folderpath%" %%i in (*.mp3 *.wav) do (
ffmpeg -i "%%~fi" -acodec libmp3lame -aq 8 -ar 32000 -ac 1 "!destpath!%%~ni.mp3"
for %%i in ("%folderpath%") do (
ffmpeg -i "%folderpath%" -acodec libmp3lame -aq 8 -ar 32000 -ac 1 "%%~di%%~pi%%~ni-optimized.mp3"
There are several neat little tricks here. Multiple folders can be dragged onto the script because it uses SHIFT to work through each command line argument. The next hurdle is to get the last part of the folder name(s) being dragged. Though this would be simple in any Unix-like shell, but to do this in batch without relying on any additional tools proved to be quite tricky. Though tokens can be parsed by FOR, finding the last one (which we need) requires us to count how many tokens there are. Only then can the last one be selected. The folder name string substitution is done by the SET command (also used in my file renaming script). Delayed variable expansion means that the variables between the the exclamation marks are evaluated once per loop rather than the default of once per script execution (more info here). I hadn’t realised until today that you can create a folder structure several layers deep with a single MD command. This avoids having to iterate through all subfolders – we can use FOR’s /R switch to handle the recursion in a single line. For more information on some of the variables containing tildes like %%~pi and %%~ni, try running FOR /?. The script also first runs the FFmpeg subroutine with ECHO in front of the commands so you can double-check the syntax before proceeding.
There are of course much more efficient ways of compressing spoken word content than MP3 these days – AAC-HEv2 for instance, but that will rule out all but the latest audio playback devices.
The source MP3 recordings were in 44.1kHz mono @ 128Kbps. This was definitely overkill for speech. Just by opting for 64Kbps I could halve that. However I couldn’t drop the sample rate below 32kHz, as this is the lowest legal limit of MPEG-1 layer III. This info is tricky to find but the LAME encoder reveals it in its extended help:
MPEG-1 layer III sample frequencies (kHz): 32 48 44.1
bitrates (kbps): 32 40 48 56 64 80 96 112 128 160 192 224 256 320
MPEG-2 layer III sample frequencies (kHz): 16 24 22.05
bitrates (kbps): 8 16 24 32 40 48 56 64 80 96 112 128 144 160
MPEG-2.5 layer III sample frequencies (kHz): 8 12 11.025
bitrates (kbps): 8 16 24 32 40 48 56 64
I think it’s safe to assume that most MP3 players can handle variable bit rate. As far as I can remember only the first generation ones couldnt from early last decade – the sort of ones which only had around 64MB of storage. I don’t think excluding players this old is much of a concern.
Playing around with VBR quality settings in LAME we can see that with -V 9 we get a small file (12.9MB becomes 3.7MB) however it’s actually an MPEG-2 layer III file which as you can see above supports lower sample rates. I’m not certain of the wider compatibility of this sub-type of MP3 file, and besides it does sound considerably worse than the lowest MPEG-1 layer III settings which is –V 8 (12.9MB becomes 5.2MB). This proved to be the best compromise between size and quality.