Focal point cropping: a more technical explanation

This is a crop operator that preserves the focal point of the image.

How does it work?

It works out the arguments for PIL.Image.crop using kiwisolver. That means you can supply any combination of the parameters explained below, and the algorithm will find the best rectangle to crop the image to.

Parameters

The data structure looks like this:

{
   "focal-point-crop": {
      "focal-point": {
          "x": "<coord>",
          "y": "<coord>"
      },
      "crop-rims": {
          "<rim-name>": "<rim-crop-spec>"
      },
      "total-width": "<total-size>",
      "total-height": "<total-size>"
   }
}

where everything between angle brackets is a grammar element. In the case of crop-rims, the user can provide a spec for each side. The components total-width and total-height are also optional.

Here is the rest of the grammar:

coord:
      percent
    | pixel-size

total-size:
      percent
    | pixel-size

percent: 
     FLOAT_NUMBER '%'

pixel-size:
     INTEGER_NUMBER 'px'

rim-name:
      'inner-'side
    | 'outer-'side

side:
      'top'
    | 'bottom'
    | 'left'
    | 'right'

rim-crop-spec:
    FLOAT_NUMBER rel unit 

rel: 
    '%' 'of'
    | 'times'

unit:
      'total-width'
    | 'total-height'
    | 'distance-to-focus'
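
As an illustration of the coord and total-size rules, here is a minimal sketch of how such a string could be turned into pixels. The function name to_pixels, the regular expression, and the tolerance for surrounding whitespace are assumptions made for this example; they are not taken from the actual implementation.

```python
import re

# Assumed lexical rules: optional whitespace, unsigned decimal numbers.
_SIZE_RE = re.compile(r"^\s*(?P<number>\d+(?:\.\d+)?)\s*(?P<unit>%|px)\s*$")


def to_pixels(spec: str, reference: int) -> float:
    """Resolve a coord / total-size string to a pixel value.

    `reference` is the original image's width or height; it is only
    used when the spec is a percent.
    """
    match = _SIZE_RE.match(spec)
    if match is None:
        raise ValueError(f"not a percent or pixel-size: {spec!r}")
    number, unit = match["number"], match["unit"]
    if unit == "px":
        # The grammar only allows INTEGER_NUMBER in front of 'px'.
        if "." in number:
            raise ValueError(f"pixel-size must be an integer: {spec!r}")
        return float(int(number))
    # Percent: scale the reference dimension.
    return float(number) / 100.0 * reference
```

For example, to_pixels("300px", 2000) gives 300.0 and to_pixels("50%", 2000) gives 1000.0.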
      

Implementation details

We only need to work out the arguments for PIL.Image.crop; the method then does the cropping for us. Note that its input parameter is a “box” with the left, upper, right, and lower pixel coordinates of the rectangle we are going to keep. For the exposition below, let’s call those values r_left, r_upper, r_right and r_lower.
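
To make the “box” concrete, here is a small sketch of the final cropping step. The image is a blank stand-in and the four numbers are made up for the example; in practice they would come out of the solver.

```python
from PIL import Image

# Stand-in for a real photo: a blank 2000x1500 image.
original = Image.new("RGB", (2000, 1500))

# Hypothetical solver output, in the order PIL expects.
r_left, r_upper, r_right, r_lower = 300, 200, 1300, 1200

# crop() takes a single (left, upper, right, lower) tuple.
cropped = original.crop((r_left, r_upper, r_right, r_lower))
print(cropped.size)  # (1000, 1000)
```

The resulting width is r_right - r_left and the height is r_lower - r_upper.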

Now, how do we go from the array of possibilities above to the crop parameters?

We write equations depending on what the user specifies, and use kiwisolver to find a solution to them.

Kiwisolver lets us assign different strengths to the constraints, so that we can still find a solution when the problem is under-specified or over-specified.

Base case (very under-constrained)

Let me explain this with an easy example, the case where the user only specifies the focal point but nothing else:

{
   "focal-point-crop": {
      "focal-point": {
          "x": "300px",
          "y": "400px"
      }
   }
}

In this case we don’t know what to do, so we won’t crop the image. Let’s build a system of equations that does nothing for this case:

r_left == 0
r_upper == 0
r_lower == o_height
r_right == o_width
...

where o_width and o_height are the input image’s width and height.

This set of constraints doesn’t change the image size, so we can always add it to the constraint solver with the lowest possible priority.
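
The base case can be sketched with kiwisolver roughly as follows. The function name solve_base_case is made up for this example, and using the "weak" strength for the lowest priority is an assumption; the real code may organise this differently.

```python
from kiwisolver import Solver, Variable


def solve_base_case(o_width, o_height):
    """Return (r_left, r_upper, r_right, r_lower) for the do-nothing crop."""
    r_left = Variable("r_left")
    r_upper = Variable("r_upper")
    r_right = Variable("r_right")
    r_lower = Variable("r_lower")

    solver = Solver()
    base_constraints = (
        r_left == 0,
        r_upper == 0,
        r_right == o_width,
        r_lower == o_height,
    )
    for constraint in base_constraints:
        # "weak" is the lowest named strength kiwisolver offers.
        solver.addConstraint(constraint | "weak")

    # Push the solved values into the Variable objects.
    solver.updateVariables()
    return tuple(v.value() for v in (r_left, r_upper, r_right, r_lower))
```

Since nothing competes with these constraints, the solver can satisfy all of them and the box covers the whole image.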

The user adds one constraint!

What if the user specifies total-height or total-width? Let’s write the constraint for the case where total-width is included; the other one is analogous.

This would be the JSON supplied by the user:

{
   "focal-point-crop": {
      "focal-point": {
          "x": "300px",
          "y": "400px"
      },
      "total-width": "1000px"
   }
}

and this is the extra constraint:

r_left + total_width == r_right

Note that this constraint is no longer compatible with r_right == o_width above, so we need to use a higher kiwisolver strength for it.
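
Here is a sketch of how the two strengths could interact. The function name solve_with_total_width and the choice of "weak" and "strong" are assumptions made for the example; the point is only that the width constraint must outrank the base constraints.

```python
from kiwisolver import Solver, Variable


def solve_with_total_width(o_width, o_height, total_width):
    """Return (r_left, r_upper, r_right, r_lower) for a fixed crop width."""
    r_left = Variable("r_left")
    r_upper = Variable("r_upper")
    r_right = Variable("r_right")
    r_lower = Variable("r_lower")

    solver = Solver()
    # Base case: keep the whole image, at the lowest strength.
    for constraint in (
        r_left == 0,
        r_upper == 0,
        r_right == o_width,
        r_lower == o_height,
    ):
        solver.addConstraint(constraint | "weak")

    # The user's request outranks the base case.
    solver.addConstraint((r_left + total_width == r_right) | "strong")

    solver.updateVariables()
    return tuple(v.value() for v in (r_left, r_upper, r_right, r_lower))


left, upper, right, lower = solve_with_total_width(2000, 1500, 1000)
print(right - left)  # 1000.0
```

With only these constraints the solver has to give up one of the weak horizontal constraints, and which one it gives up is not determined here. Placing the rectangle around the focal point needs further constraints.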

Also note that the constraint will be different if the user specifies a percent constraint:

{
   "focal-point-crop": {
      "focal-point": {
          "x": "300px",
          "y": "400px"
      },
      "total-width": "50%"
   }
}

In this case, we would use:

total_width == 0.5 * o_width
r_left + total_width == r_right

So the work centers on finding the right equations and the right strengths for each type of constraint that the user can specify.

About the rims

  • inner-left means the distance from the focal point to the left edge of the rectangle the image is cropped to.
  • outer-left means the distance from the image’s left border to the left edge of the rectangle the image is cropped to.

and so forth for the other sides.

  • distance-to-focus means the distance from the original image’s border (before cropping) to the focal point. It is calculated automatically.

The constructions % of and times let the user specify a fraction either as a percentage or as a 0 to 1 fraction.
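
As a sketch of how a rim-crop-spec might be read, the helper below normalises both forms to a plain multiplier plus the unit it applies to. The name parse_rim_spec, the regular expression, and the whitespace handling are assumptions for illustration only.

```python
import re

# Assumed lexical rules: unsigned decimal number, flexible whitespace.
_RIM_RE = re.compile(
    r"^\s*(?P<number>\d+(?:\.\d+)?)\s*"
    r"(?P<rel>%\s*of|times)\s+"
    r"(?P<unit>total-width|total-height|distance-to-focus)\s*$"
)


def parse_rim_spec(spec: str) -> tuple[float, str]:
    """Return (factor, unit), where factor is a plain multiplier."""
    match = _RIM_RE.match(spec)
    if match is None:
        raise ValueError(f"not a rim-crop-spec: {spec!r}")
    factor = float(match["number"])
    if match["rel"].startswith("%"):
        # "50% of X" means the same as "0.5 times X".
        factor /= 100.0
    return factor, match["unit"]
```

With this, "50% of total-width" and "0.5 times total-width" both yield (0.5, "total-width").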
