Is the extension in the image URL different from its actual format?

nicola · January 2, 2024, 6:17am

I downloaded an image from a URL like “https://www.xxxx.com/filename.jpeg,” anticipating it to be in JPEG format, which is compatible with the Computer Vision Annotation Tool (CVAT). However, upon retrieval, it was saved as either “filename.heif” or “filename.jpeg.heif.” Consequently, when attempting to create a task with this image, an error occurred since the HEIF format is not supported in CVAT. (CVAT automatically downloads images and generates a task when I submit image URLs.)

Given that I typically input over 1000 image URLs to create a task, identifying invalid URLs or incompatible images becomes a challenging task. Is there a method to determine the “actual format” solely by examining the image URL? Alternatively, is there a way to skip invalid URLs within CVAT?

preetpal · January 2, 2024, 7:28am

It’s a common issue: the extension in a URL doesn’t necessarily match the actual image format. The extension is only part of a name, and the server decides which bytes it actually returns, so you cannot reliably determine an image’s format from its URL alone. As far as I know, CVAT has no built-in feature to automatically filter out images in incompatible formats. However, I can suggest a couple of possible solutions.
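To show what the “actual format” means in practice: most image formats announce themselves in the first few bytes of the file, so a script can fetch just the start of each URL and look at that instead of the extension. Here is a minimal sketch using only the Python standard library. The function names (`sniff_format`, `sniff_url`) and the list of HEIF brands are my own choices for illustration, not anything from CVAT, and the brand list is not exhaustive.

```python
import urllib.request

# Brand codes commonly found in HEIF/HEIC files (assumption: not an exhaustive list).
HEIF_BRANDS = {b"heic", b"heix", b"hevc", b"hevx", b"heim", b"heis", b"mif1", b"msf1"}


def sniff_format(head: bytes) -> str:
    """Guess the image format from the first bytes of a file."""
    if head.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if head[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    if head[:4] == b"RIFF" and head[8:12] == b"WEBP":
        return "webp"
    # HEIF files start with a 4-byte box size, then "ftyp", then a brand code.
    if head[4:8] == b"ftyp" and head[8:12] in HEIF_BRANDS:
        return "heif"
    return "unknown"


def sniff_url(url: str, timeout: float = 10.0) -> str:
    """Download only the beginning of the file at `url` and guess its format."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return sniff_format(resp.read(32))
```

Calling `sniff_url` on each of your URLs would tell you which ones are really HEIF even though they end in “.jpeg”. It does not handle download errors, so treat it as a starting point.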

One practical solution is to write a short script, preferably in Python, that pre-checks the images for you. The script downloads each image and uses the Pillow library to identify its actual format, regardless of the extension in the URL. If an image is not in a format CVAT accepts (JPEG, for example), the script can convert it or exclude it from your CVAT upload list. This requires some basic scripting knowledge, but it’s a one-time effort that can save you a lot of time in the long run. You’ll need Python and the Pillow library for this, both of which are straightforward to set up and use.
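A rough sketch of such a pre-check script is below. It assumes Pillow is installed (`pip install Pillow`). The names `detect_format`, `filter_urls`, and `ACCEPTED` are hypothetical, and the set of accepted formats is my assumption, so adjust it to whatever your CVAT setup takes. Note that a stock Pillow install typically cannot decode HEIF, so those files simply come back as unreadable (`None`) and land in the rejected list.

```python
import io
import urllib.request

from PIL import Image, UnidentifiedImageError

# Assumption: formats you want to keep; adjust to what your CVAT setup accepts.
ACCEPTED = {"JPEG", "PNG"}


def detect_format(data: bytes):
    """Return Pillow's format name (e.g. "JPEG"), or None if Pillow cannot read the data."""
    try:
        with Image.open(io.BytesIO(data)) as img:
            return img.format
    except (UnidentifiedImageError, OSError):
        return None


def filter_urls(urls, timeout=10.0):
    """Split URLs into (accepted, rejected) lists of (url, detected_format) pairs."""
    accepted, rejected = [], []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                fmt = detect_format(resp.read())
        except (OSError, ValueError):
            # Network/HTTP errors or a malformed URL: treat as an invalid entry.
            fmt = None
        (accepted if fmt in ACCEPTED else rejected).append((url, fmt))
    return accepted, rejected
```

You would pass your list of URLs to `filter_urls`, submit only the accepted ones to CVAT, and review the rejected ones separately. Converting the rejected images instead of dropping them is possible too, but for HEIF that would need an extra decoder plugin, which this sketch does not cover.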

I hope this helps. Let me know if you have more questions.

Thank you

nicola · January 3, 2024, 6:31am

While that sounds effective, it seems a bit complex for my current skills. Are there simpler alternatives?

preetpal · January 3, 2024, 10:22am

Certainly! If you prefer a less technical route, you might consider using an online image conversion tool. These services allow you to convert images to different formats, though they might not be suitable for very large numbers of images.
Another option is to explore the CVAT community for any plugins or tools that have been developed to tackle this problem. The CVAT user community might have already created solutions for such common issues, or you could propose this as a feature request in the community forums.