8
Some of the last O'Reilly ebooks I got have become really large (e.g. 60MB for the mobi version and 5.5MB for the epub version) - and these are computer science text books (not art books). The customer support claims that this is o.k. and they want to provide high quality images.
Is there software that can easily reduce the size of images in an epub or mobi by e.g. downsampling them? Is there a calibre plugin for this?
  • 1
    Interesting idea. Should be relatively simple to do if it doesn't exist: you don't have to analyse the HTML in the epub for that, just identify large images, scale them and reconstruct the ePub (zip) file with the images keeping the same names. Do you have a specific title? (I still have my voucher for a free ebook from O'Reilly that I got at EuroPython, so I might be able to try things out.) – Anthon Aug 10 '16 at 5:33
  • @Anthon: The monster is "Cassandra The Definitive Guide Second Edition": The mobi has 60MB, the PDF 49MB and the epub 5.5MB. – Martin Schröder Aug 10 '16 at 22:08
  • .mobi files contain multiple versions of the ebook for all the different kinds of kindles, that's one reason why they are bigger – maxwell Aug 17 '16 at 0:28
  • Perhaps break the epub into epub1, epub2, etc.? But how...? – Thufir Jul 8 '18 at 7:55

6 Answers

4
I had a similar situation: a 134 MB .mobi that I wanted to shrink.
Here's how I reduced the file size to 26 MB. The key to this approach is that an .epub is actually just a zipped web page with image files in it.
  1. Converted the .mobi to .epub with Calibre.
  2. Renamed the .epub to .zip.
  3. Extracted the .zip. Note the images subfolder. Mine had more than 1000 files in it.
  4. Used Photoshop (although any decent image editing app could do it) to do a batch job against the contents of that folder. Saved each file as the original file name, but as a low quality .jpg. (In my case, the original filenames were .jpeg, but Photoshop saved the new files as .jpg, so I had to delete the original .jpegs and rename the new ones from .jpg to .jpeg after the batch was completed. There's probably a way to avoid that step.)
  5. Re-zipped it.
  6. Renamed to .epub. I wanted an .epub as the end result, so I was done. But you could of course now use Calibre to convert to whatever format you want.
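Before running a batch job like the one in step 4, it can help to see which images actually account for the size. The following is only a minimal sketch: `list_big_images` is a hypothetical helper name (not part of Calibre or any other tool), and it assumes the .epub has already been extracted to a directory as in step 3.

```shell
# list_big_images DIR [MIN_KB]
# Hypothetical helper: prints image files under DIR that are larger than
# MIN_KB KiB (default 100), as "size-on-disk-in-KB<TAB>path", biggest first.
list_big_images() {
  find "$1" -type f \
    \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' -o -iname '*.gif' \) \
    -size +"${2:-100}"k \
    -exec du -k {} + | sort -rn
}
```

For example, `list_big_images extracted-book 500` would list only the images above roughly 500 KiB, which may be all you need to recompress.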
4
A bash script for bulk-compressing the images in big.epub into small.epub, on Windows 7 or Linux. On Windows you need to install Cygwin (with bash, unzip, zip, find and ImageMagick).
  #!/usr/bin/env bash
  USAGE="Usage: $0 big.epub small.epub"
  : "${2:?$USAGE}"
  # Quote every path so file names containing spaces work
  unzip -d "$1-tmp" "$1"   # unzip big.epub into a temporary directory
  cd "$1-tmp" || exit 1
  # Shrink all images larger than 400x400 pixels in the temporary directory
  find . -type f  \
   -iregex  '^.*[.]\(jpg\|jpeg\|png\|gif\)$' \
   -exec mogrify -quality 85 -resize '400x400>' {} \;
  zip -Duro "../$2" .   # zip up the compressed tmp folder as small.epub
  cd ..
  rm -rf "$1-tmp"   # clean up
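One packaging detail that stricter readers and validators care about: the EPUB container format expects the `mimetype` file to be the first entry in the archive and to be stored uncompressed. The sketch below shows one way to re-zip an extracted book with that layout. `pack_epub` is a hypothetical helper name, and it assumes Info-ZIP `zip` is installed and that the extracted directory has a `mimetype` file at its top level.

```shell
# pack_epub SRC_DIR OUT.epub
# Hypothetical helper: zips SRC_DIR into OUT.epub with "mimetype" as the
# first entry, stored uncompressed. Runs in a subshell, so the caller's
# working directory and variables are left untouched.
pack_epub() (
  src="$1"
  out="$2"
  # Make the output path absolute, because we cd into SRC_DIR below
  case "$out" in
    /*) ;;
    *) out="$PWD/$out" ;;
  esac
  rm -f "$out"
  cd "$src" || exit 1
  # -X: no extra file attributes, -0: store without compression
  zip -q -X0 "$out" mimetype || exit 1
  # -r: recurse, -9: best compression, -D: no directory entries
  zip -q -Xr9D "$out" * -x mimetype
)
```

With this helper, the `zip -Duro` step in a script like the one above could be swapped for `pack_epub "$1-tmp" "$2"`, run before changing into the temporary directory.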
Alternatively, you can strip all images from the epub in one step with WinZip or zip; the tiny epub will still be readable, just without illustrations. This modifies the file in place, so keep a backup of big.epub if you want to see the images later on.
 zip -d big.epub "*.jpg"
4
Note: I intended to leave this as a comment for @mosh but didn't have the necessary rep. This is my own personal solution based on the solution he outlined. This will recursively scan a specified directory and compress files larger than 49MB, overwriting the original files. [I chose that size due to the fact that Google Play Books uploads are limited to 50MB in size, and that's the service I utilize most frequently, generally with uploads from my Humble eBook Bundle purchases.]
First file: shrinkemall.sh
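#!/usr/bin/env bash
USAGE="Usage: $0 directory"
: "${1:?$USAGE}"
# Only regular .epub files over 49 MB; unzip cannot open other formats
find "$1" -type f -iname '*.epub' -size +49M \
  -exec ./shrinkemall1.sh {} \;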
Second file: shrinkemall1.sh
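#!/usr/bin/env bash
# Resolve the path up front so it is still valid after cd'ing into tmp
epub="$(cd "$(dirname "$1")" && pwd)/$(basename "$1")"
unzip -d tmp "$epub"
cd tmp || exit 1
find . -type f -iregex '^.*[.]\(jpg\|jpeg\|png\|gif\)$' \
  -exec mogrify -quality 85 -resize '800x800>' {} \;
zip -Duro "$epub" .   # update the original file in place
cd ..
rm -rf tmp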
Obviously you'd want to edit the parameters to suit the service and/or device you're targeting.
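To check whether a shrunk file now fits under an upload limit such as the 50 MB mentioned above, something like the following could be used. It is only a sketch: `fits_limit` is a hypothetical helper name, and it assumes the limit is counted in units of 1024*1024 bytes (the same unit `find -size +49M` uses), which may not match how a given service counts.

```shell
# fits_limit FILE [LIMIT_MB]
# Hypothetical helper: succeeds (exit 0) if FILE is at most LIMIT_MB MiB
# (default 50), fails with 1 if it is larger, and with 2 if FILE is missing.
fits_limit() (
  [ -f "$1" ] || exit 2
  # Arithmetic expansion strips the padding some wc implementations print
  size=$(( $(wc -c < "$1") ))
  [ "$size" -le $(( ${2:-50} * 1024 * 1024 )) ]
)
```

For example, `fits_limit book.epub 50 || echo "still too big"` reports files that need a smaller `-resize` or `-quality` setting.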
Thanks, @mosh !
1
The short answer is that with few exceptions you should NEVER need to scale down .mobi files because Amazon.com will do it for you.
I think 5.5 MB is a reasonable file size for an epub file. I don't know if it is a typo, but 60MB is a bit high for mobi files.
However, when you buy a .mobi file directly from the publisher, the preferred way to get it to your device is to upload it to Amazon's Personal Document Server which will feed it via wifi to your device.
See: http://www.pcmag.com/article2/0,2817,2484180,00.asp
For mobi files, I do NOT recommend transferring them to your device via USB, because that puts a bloated file onto a device which may not be ready to process it.
Uploading the file to Amazon.com first lets Amazon send your device a file of an appropriate size.
I was testing an ebook before I published it on Amazon.com. The epub file had high quality graphics and was 7.6 MB. The mobi file was 16 MB, but after I uploaded it to the Kindle's Personal Document server, I think the Kindle e-ink device only needed to download a smaller 1.5 MB file. It looked perfect even at that smaller size. Amazon reduced the size and resolution of graphics to match the device.
(It's possible that O'Reilly has multimedia files which are causing the bloat, but I doubt it.)
Some authors and publishers have problems optimizing their graphics and file sizes, but the O'Reilly guys producing those ebooks are the best in the industry. So they are providing a book optimized for several different platforms and devices.
If you ordered the ebook directly from Amazon, then Amazon would automatically download the file size appropriate for your device. But when you purchase the ebook directly from the publisher, you are essentially buying a file before it has been compressed by Amazon.com (that's why it is foolish to transfer the file via USB to your device).
I am a big believer in buying the non-DRM file from a publisher or Smashwords rather than buying it from Amazon.com, but if you do that, you need to upload it to your device.
By the way, as someone who produces ebooks, I find producing .mobi files a real pain in the neck, although I appreciate that Amazon now sends you a file sized according to what your device can handle.
See also: this article explains why mobi files tend to be so huge and why it doesn't matter for the end user. https://www.52novels.com/from-the-ebook-formatting-files-why-is-my-mobi-file-so-huge/
FURTHER THOUGHTS: I occasionally have issues reading epub files with lots of graphics on Google Play Books or iBooks. Sometimes you're just going to have to live with the file size (especially if the graphics are a vital part of the content), but I trust O'Reilly to do as much optimizing as possible. Occasionally you find self-published ebooks by amateurs who make no effort to keep the file size down. This raises an interesting question: what file size would be difficult to read on current devices and reading software? Generally, if the epub is over 25 MB, I have found it cumbersome to read on a tablet.
  • 2
    Why would I offer .mobi files I have to Amazon for scaling? I never have done so with any of the .mobi files that I've got from OReilly, so how can amazon.com scale a .mobi file they never ever touched? – Anthon Aug 10 '16 at 7:33
  • If I understand you correctly, you are saying: generate a file from calibre or kindlegen and then transfer it via USB to the device. The device can certainly read the file, but the file will typically be larger than it needs to be, typically 2x the size. If you upload to Amazon's personal document server, though, the file size will be reduced when it is served. I confess, I don't know how much or how well Calibre reduces .mobi files compared to kindlegen; Calibre is used mainly for generating DIY ebooks and not for ebooks available for sale (so I haven't used it in a while). – idiotprogrammer Aug 10 '16 at 16:22
  • Anthon, I can virtually guarantee that O'Reilly produces an epub and uses kindlegen to convert from epub to mobi. For this reason, the .mobi file will be very bloated; kindlegen bloats the file by design. For this reason I would be surprised if O'Reilly recommends that customers transfer via USB instead of Amazon's own Personal Document service. – idiotprogrammer Aug 10 '16 at 16:25
  • 1
    Here's an article discussing this 52novels.com/… – idiotprogrammer Aug 10 '16 at 16:27
  • 1
    @idiotprogrammer: The description at 52novels.com is missing a major point: KindleGen generates hybrid .mobi files that contain the old MOBI file, the new AZW3 file and the source file. Amazon will deliver either the older mobi format or the newer AZW3 file. (Customers with latest generation Kindles will get KFX files.) I don't know whether the O'Reilly publishing tools will strip the old MOBI file from KindleGen-generated hybrid mobi files, but I doubt it. @Martin Schröder: If you're using KindleGen, add the -dont_append_source parameter to strip the source files. – Nemo XXX Aug 12 '16 at 17:32
1
calibre already has a tool for this, but it isn't a plugin. It's part of the ebook editor that installs with calibre. The editor will only edit files in the epub or azw3 format though. You can convert the mobi file to azw3 and then edit it. Converting it from mobi to azw3 may automatically downsize the file, but I'm not sure. Once you have the file opened in calibre's ebook editor, click on Tools, then click Compress images losslessly. This will bring up a dialog where you can accept the default compression or choose your own. The downside to this is that it only compresses images that are in the jpg format.