A dive into the differences in JPEG image read and resizing with OpenCV, Tensorflow and Pillow and also on how to make them consistent.

Modern Computer Vision (CV) is a hot field of research that largely involves working with images. To process those images, one needs a framework to open them. With frameworks developing rapidly, each one has its own way of handling images and its own specifications. As a result, CV solutions developed in one framework may not work as expected in another. The blog “How Tensorflow’s tf.image.resize stole 60 days of my life” is a perfect example of this type of situation. It can take multiple days to figure out what went wrong, which could delay a project extensively.

This blog discusses one such use case in detail, where OpenCV and Tensorflow differ in reading and resizing a JPEG image. It also shows a way to make them work consistently. Thanks to Tejas Pandey for the help in achieving consistency between OpenCV and Tensorflow. Please note that I am an employee of Intel and all information and opinions shown in the blog are my own and don’t represent those of my employer.

We’ll start by opening a simple JPEG image of a dog with two frameworks, Tensorflow 1.x and OpenCV.

Batch resize and pad images using Python and OpenCV

When using AI models like Stable Diffusion, input images sometimes need to be of a specific size. Stable Diffusion requires dimensions that are multiples of 64, and (at least version 1.5) works best with images of 512 pixels in width or height. If you have an image whose dimensions are not multiples of 64, like 599×205 pixels, you can maintain the aspect ratio by resizing it to 1496×512. However, 1496 is not a multiple of 64, so the image needs to be padded to 1536×512 in order to be processed by the model.

To automate this process for a large number of images, I wrote a Python script using OpenCV. I chose OpenCV because it is a widely used, easy-to-use library with powerful capabilities.

Batch resize and pad images

First you need to install OpenCV. This can be done with “pip install opencv-python” or “conda install -c conda-forge opencv” (whichever package manager you prefer). The script is straightforward and can be viewed below or here. It demonstrates how OpenCV can be used to easily perform basic tasks such as opening a file, resizing, padding, and writing the output back to the filesystem.

OpenCV does not support the AVIF file format (read here), which is why I included the option to exclude extensions. This allows processing of directories with mixed content, for example obtained by scraping the web. Most other image file formats are supported though (such as JPEG, PNG, BMP, TIFF, WEBP and others). Output images are renamed to _scaled.png. OpenCV automatically determines the file format based on the extension, so you do not need to specify it explicitly.

I have chosen white as the padding color since the Stable Diffusion WebUI tool uses a black mask, and this way I can easily see what has been masked and what hasn’t. I also pad only to the right and to the bottom since, in my experience, those are the areas for which inpainting is usually most useful.

Key excerpts from the script (not contiguous):

    # this will return a tuple of root and extension

    new_filename = output_directory + "/" + f_name.replace("." + str(get_file_ext(filename)), "." + output_ext).replace(" ", "_")

    # gets the nearest larger 64, starting with 512

    # height derived from a fixed width, then padded at the bottom
    preferred_height = round(preferred_width / width * height)
    pad_bot = get_nearest_larger(preferred_height) - preferred_height

    # width derived from a fixed height, then padded on the right
    preferred_width = round(preferred_height / height * width)
    pad_right = get_nearest_larger(preferred_width) - preferred_width

    # resize, then pad right and bottom only (white, as described above)
    img_new = cv2.resize(img, (preferred_width, preferred_height))
    img_new_padded = cv2.copyMakeBorder(img_new, 0, pad_bot, 0, pad_right, borderType=cv2.BORDER_CONSTANT, value=(255, 255, 255))
    cv2.
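The sizing arithmetic from the 599×205 example can be sketched in plain Python. This is a minimal sketch rather than the script itself: the names `round_up_to_multiple` and `resize_and_pad_sizes` are hypothetical, and it assumes that the shorter side is scaled to 512 and that each side is then rounded up to the next multiple of 64.

```python
TARGET_SIDE = 512   # side length Stable Diffusion 1.5 works best with
MULTIPLE = 64       # dimensions must be multiples of this


def round_up_to_multiple(value, multiple=MULTIPLE):
    """Return the smallest multiple of `multiple` that is >= value."""
    return ((value + multiple - 1) // multiple) * multiple


def resize_and_pad_sizes(width, height, target=TARGET_SIDE):
    """Scale the shorter side to `target`, keeping the aspect ratio,
    then compute the padding needed on the right and at the bottom."""
    if height <= width:
        # landscape or square: fix the height, derive the width
        new_height = target
        new_width = round(target / height * width)
    else:
        # portrait: fix the width, derive the height
        new_width = target
        new_height = round(target / width * height)
    pad_right = round_up_to_multiple(new_width) - new_width
    pad_bottom = round_up_to_multiple(new_height) - new_height
    return (new_width, new_height), (pad_right, pad_bottom)


resized, padding = resize_and_pad_sizes(599, 205)
print(resized)   # (1496, 512)
print(padding)   # (40, 0)
```

The two tuples correspond to the target size passed to cv2.resize and to the right and bottom border widths passed to cv2.copyMakeBorder: 1496 + 40 gives the padded width of 1536.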