DaoAI World Server Inference Service
---------------------------------------

When connected to the internet, you can send HTTP requests to the DaoAI World server for inference. Below is example code:

.. code-block:: python

    # import the inference SDK
    from inference_client import InferenceHTTPClient

    # initialize the client
    CLIENT = InferenceHTTPClient(
        api_endpoint="https://api.dev.daoai.ca",
        api_key="XXXXXXXXXXXXXXXXXXX"
    )

    # infer with a TRAINED MODEL
    result = CLIENT.infer("YOUR_IMAGE.jpg",
        trained_model_uid="XXXXXXXXXXXXXXXXXXX",
    )

    # infer with a PRETRAINED MODEL (e.g. auto_mask)
    result = CLIENT.infer("YOUR_IMAGE.jpg",
        pretrained_model_type="auto_mask",
    )

Trained Models
~~~~~~~~~~~~~~~~~~~~~~~~~

You can use any of your trained models for inference. Import the ``InferenceHTTPClient`` library, provide your account's ``API Key`` and the model's version ID (``trained_model_uid``), initialize the client, and then run inference.

- Example code for using a model from your account:

.. code-block:: python

    # import the inference SDK
    from inference_client import InferenceHTTPClient

    # initialize the client
    CLIENT = InferenceHTTPClient(
        api_endpoint="https://api.dev.daoai.ca",
        api_key="XXXXXXXXXXXXXXXXXXX"
    )

    # infer with a TRAINED MODEL
    result = CLIENT.infer("YOUR_IMAGE.jpg",
        trained_model_uid="XXXXXXXXXXXXXXXXXXX",
    )

You can find the model's ``API Key`` and ``trained_model_uid`` by clicking :ref:`Model Deployment` in your trained project.

.. note::

    Different versions of a model have different ``trained_model_uid`` values, so be sure to check carefully.

.. warning::

    The API Key is your credential for remote access to DaoAI World. To keep your data secure, store your API Key safely and prevent unauthorized use.

The inference result is returned as a **dictionary** containing:

- **"inference_time"**
- **"inference_device"**
- **"result"**

The returned result is a **JSON** object from the requests library, containing common fields such as masks, boxes, and scores.
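The exact layout of the ``result`` payload depends on the model type. As a minimal sketch of how you might inspect a response: the top-level keys below are the documented ones, but the inner ``boxes``/``scores`` layout is an assumption, so adapt the keys to your model's actual output.

.. code-block:: python

    # minimal sketch; "inference_time", "inference_device", and "result"
    # are the documented top-level keys, while the inner "boxes"/"scores"
    # layout is an assumption to be adapted to your model type
    result = CLIENT.infer("YOUR_IMAGE.jpg",
        trained_model_uid="XXXXXXXXXXXXXXXXXXX",
    )

    print("inference time:  ", result["inference_time"])
    print("inference device:", result["inference_device"])

    detections = result["result"]
    for box, score in zip(detections.get("boxes", []), detections.get("scores", [])):
        print("box:", box, "score:", score)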
Pretrained Models
~~~~~~~~~~~~~~~~~~~~~~~~~

You can also use the pretrained models provided by DaoAI World, which include ``OCR``, ``auto_mask``, and ``mask_predictor``.

OCR
**********

The OCR model uses DaoAI's pretrained model to recognize text in images and return the results.

.. code-block:: python

    # import the inference SDK
    from inference_client import InferenceHTTPClient

    # initialize the client
    CLIENT = InferenceHTTPClient(
        api_endpoint="https://api.dev.daoai.ca",
        api_key="XXXXXXXXXXXXXXXXXXX"
    )

    # infer with a PRETRAINED MODEL (e.g. ocr)
    result = CLIENT.infer("YOUR_IMAGE.jpg", pretrained_model_type="ocr")

The returned result is a **JSON** object from the requests library, containing common fields such as masks, boxes, and scores.

auto_mask
**********

The auto_mask model is a **global** intelligent segmentation model provided by the DaoAI World server. It accepts several parameters:

- **points_per_side**: Controls the density of the generated point prompts. A larger value means denser points covering a wider area, helping the model capture more detail; a smaller value may miss small or complex objects.
- **box_nms_thresh**: NMS (Non-Maximum Suppression) is a post-processing step that removes redundant detection boxes. ``box_nms_thresh`` is the IoU (Intersection over Union) threshold above which two boxes are considered duplicates of the same object. A smaller value makes NMS stricter, removing more redundant boxes; a larger value keeps more boxes but may introduce more false positives.
- **pred_iou_thresh**: Filters the predicted segmentation regions by the model's predicted mask quality (IoU) score. A higher value removes low-quality masks, retaining only high-confidence ones; a lower value keeps more masks but may include noise.
- **min_mask_region_area**: Directly controls the minimum area of a mask region. A larger value removes smaller masks, keeping only larger ones.
- **point_grids**: Specifies the grid used to generate point prompts. If set to ``None``, a default point grid (controlled by ``points_per_side``) is used; if a specific grid is provided, points are placed according to it, which helps generate denser points in certain areas.

.. code-block:: python

    # import the inference SDK
    from inference_client import InferenceHTTPClient

    # initialize the client
    CLIENT = InferenceHTTPClient(
        api_endpoint="https://api.dev.daoai.ca",
        api_key="XXXXXXXXXXXXXXXXXXX"
    )

    # infer with a PRETRAINED MODEL (e.g. auto_mask)
    result = CLIENT.infer("YOUR_IMAGE.jpg",
        pretrained_model_type="auto_mask",
        points_per_side=48,
        box_nms_thresh=0.7,
        pred_iou_thresh=0.7,
        min_mask_region_area=100,
        point_grids=None,
    )

**Saving and visualizing inference results**

Save the entire result as an image:

.. code-block:: python

    import base64
    from io import BytesIO

    from PIL import Image

    # decode the base64-encoded image in the result
    img_data = base64.b64decode(result['image'])

    # open the decoded image data with Pillow
    img = Image.open(BytesIO(img_data))

    # save the image as PNG
    img.save('output.png', 'PNG')

Get a single mask from the result and display it:

.. code-block:: python

    import matplotlib.pyplot as plt
    from pycocotools import mask

    # access a single mask
    rle_data = result['masks'][0]['segmentation']

    # decode the COCO RLE (Run-Length Encoding) into a binary mask
    binary_mask = mask.decode(rle_data)

    # display the mask
    plt.imshow(binary_mask, cmap="gray")
    plt.axis("off")
    plt.show()
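Since auto_mask usually returns many masks, it can also be useful to overlay all of them on the source image at once. Below is a minimal sketch building on the single-mask example above; it assumes every entry of ``result['masks']`` carries a COCO RLE under ``'segmentation'``, as in that example.

.. code-block:: python

    import cv2
    import numpy as np
    from pycocotools import mask

    # load the original image for the overlay
    image_rgb = cv2.cvtColor(cv2.imread("YOUR_IMAGE.jpg"), cv2.COLOR_BGR2RGB)
    overlay = image_rgb.copy()

    # assumption: each entry holds a COCO RLE under 'segmentation'
    for m in result['masks']:
        binary_mask = mask.decode(m['segmentation'])
        color = np.random.randint(0, 255, size=3, dtype=np.uint8)
        # blend a random color into the pixels covered by this mask
        region = binary_mask > 0
        overlay[region] = (0.5 * overlay[region] + 0.5 * color).astype(np.uint8)

    cv2.imwrite("overlay.png", cv2.cvtColor(overlay, cv2.COLOR_RGB2BGR))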
mask_predictor
******************

The mask_predictor model is a **guided** intelligent segmentation model provided by the DaoAI World server, similar to the smart labeling tools used during annotation. You can input guiding points or boxes, and the model returns the mask for the indicated region. It accepts the following parameters:

- **point_coords**: The coordinates of the guiding points. Specify points on the image to tell the model which regions to focus on; these points can mark the centers or boundaries of the objects you are interested in. Typically a 2D array where each row holds the coordinates (x, y) of one point.
- **box**: A bounding box that outlines the region of interest with a rectangle, helping the model focus on the target area more quickly. Typically a list or array with four elements, `[x1, y1, x2, y2]`, representing the coordinates of the top-left and bottom-right corners of the box.
- **point_labels**: The labels of the guiding points. Assign a label to each point to tell the model which category the point belongs to (in the example below, 0 marks background and 1 marks the object). Typically an array of the same length as `point_coords`, with each element being the label of the corresponding point.

.. code-block:: python

    # import the inference SDK
    from inference_client import InferenceHTTPClient

    # initialize the client
    CLIENT = InferenceHTTPClient(
        api_endpoint="https://api.dev.daoai.ca",
        api_key="XXXXXXXXXXXXXXXXXXX"
    )

.. code-block:: python

    # use a box
    x_min, y_min, x_max, y_max = 400, 400, 600, 600  # replace with your coordinates
    box = [x_min, y_min, x_max, y_max]

    result = CLIENT.infer(file_path,
        pretrained_model_type="mask_predictor",
        point_coords=None,
        box=box,
        point_labels=None,
    )

.. code-block:: python

    # use points
    point_coords = [[100, 50], [200, 150]]  # replace with your coordinates
    point_labels = [0, 1]  # labels for the coordinates (0: background, 1: object)

    result = CLIENT.infer(file_path,
        pretrained_model_type="mask_predictor",
        point_coords=point_coords,
        box=None,
        point_labels=point_labels,
    )

**Saving and visualizing inference results**

.. code-block:: python

    import os
    from datetime import datetime

    import cv2
    import numpy as np
    from pycocotools import mask

    image = cv2.imread(file_path)
    image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

    # decode the COCO RLE (Run-Length Encoding) into a binary mask
    binary_mask = mask.decode(result)

    # find the minimum bounding box of the mask
    # (np.where returns row indices first, i.e. y before x)
    y_coords, x_coords = np.where(binary_mask > 0)
    if len(y_coords) == 0:
        raise SystemExit("No mask detected.")
    y_min, y_max = np.min(y_coords), np.max(y_coords)
    x_min, x_max = np.min(x_coords), np.max(x_coords)

    # keep only the pixels covered by the mask
    masked_image = cv2.bitwise_and(image_rgb, image_rgb, mask=binary_mask)

    # crop the image to the bounding box, removing the black border
    cropped_image = masked_image[y_min:y_max+1, x_min:x_max+1]

    # save the cropped image with a timestamped name
    current_time = datetime.now().strftime('%Y%m%d_%H%M%S')
    cut_out_name = f"cut_out_{current_time}.png"
    output_folder = "./"
    cut_out_path = os.path.join(output_folder, cut_out_name)
    cv2.imwrite(cut_out_path, cv2.cvtColor(cropped_image, cv2.COLOR_RGB2BGR))
    print("Image saved to:", cut_out_path)
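To sanity-check your prompts, you can also view the returned mask together with the guiding box or points before cropping. Below is a minimal matplotlib sketch that reuses ``image_rgb`` and ``binary_mask`` from the code above, plus the ``box``, ``point_coords``, and ``point_labels`` variables from the earlier examples; set any prompt variable you did not use to ``None``.

.. code-block:: python

    import matplotlib.pyplot as plt
    import matplotlib.patches as patches

    fig, ax = plt.subplots()
    ax.imshow(image_rgb)                            # original image from above
    ax.imshow(binary_mask, cmap="gray", alpha=0.5)  # semi-transparent mask overlay

    # draw the guiding box, if one was used
    if box is not None:
        x1, y1, x2, y2 = box
        ax.add_patch(patches.Rectangle((x1, y1), x2 - x1, y2 - y1,
                                       fill=False, edgecolor="red"))

    # draw the guiding points, if any (green: object, red: background)
    if point_coords is not None:
        for (x, y), label in zip(point_coords, point_labels):
            ax.plot(x, y, "o", color="green" if label == 1 else "red")

    ax.axis("off")
    plt.show()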