Edge Inference Fee (Postpaid)

Edge Inference provides GPU inference services on EdgeOne edge nodes, allowing users to deploy custom model images or platform-preset models to edge nodes for inference. It is supported only in the Enterprise Edition plan and is billed postpaid by instance running duration, with a single inference instance as the smallest billing unit.
Instance running duration is the total time, in seconds, from the startup to the termination of an inference instance. Billing is calculated per second with a minimum charge of 1 second, and any fraction of a second is rounded up to a full second.
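The rounding rule above can be sketched in a few lines of Python. This is only an illustration, not the platform's metering code: the function name `billable_seconds` and the use of numeric timestamps are assumptions, and treating a zero-length run as 1 billed second is one reading of the "minimum charge of 1 second" rule.

```python
import math


def billable_seconds(start_ts: float, stop_ts: float) -> int:
    """Return the billed duration, in whole seconds, for one instance run.

    start_ts and stop_ts are timestamps in seconds (hypothetical inputs).
    """
    if stop_ts < start_ts:
        raise ValueError("stop time precedes start time")
    elapsed = stop_ts - start_ts
    # Any fraction of a second is rounded up; the minimum charge is 1 second.
    return max(1, math.ceil(elapsed))
```

For instance, a run of 2.5 seconds would be billed as 3 seconds under this sketch, and a run of exactly 3600 seconds as 3600 seconds.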
Note:
Edge Inference is currently in beta testing, and access requires being added to the allowlist. It is supported only in the Enterprise Edition plan. To request access, please Contact Us.

Edge Inference Cost

Edge Inference is billed on a postpaid basis according to the running duration of instances with different GPU specifications and settled monthly. The billing method and pricing for postpaid are as follows:
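| Billable Item | GPU Specification | List Price (USD/second) | Settlement Cycle |
| --- | --- | --- | --- |
| Custom Inference Service | Entry-level (Tier A) | 0.000217 USD/second | Hour or month |
| Custom Inference Service | Basic (Tier B) | 0.000220 USD/second | Hour or month |
| Custom Inference Service | Enhanced (Tier C) | 0.000250 USD/second | Hour or month |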
Note:
1. Edge Inference is billed based on the actual running duration of instances. Billing stops as soon as the instance is stopped.
2. If Auto Scaling (AS) is enabled, each instance is metered independently, and the total cost is the sum of the running-duration costs of all instances.

Example: Creating a Custom Inference Service Under the Enterprise Plan, Selecting an Entry-Level GPU (16 GB VRAM), and Manually Setting the Number of Instances to 1

Assume that the user starts the service at 10:00:00 on March 3. From 10:00:00 to 11:00:00 on March 3, the single instance runs continuously for 1 hour (3600 seconds). At 11:00:00 on March 3, the user adjusts the number of instances from 1 to 2. Both instances then run continuously from 11:00:00 to 12:00:00, 1 hour (3600 seconds) each.
The unit price of Entry-level GPU is 0.000217 USD/second, therefore the cost settlement for these two hours is as follows:
On March 3, 10:00:00 - 11:00:00, 1 instance was running, and the cost for this hour: 0.000217 × 3600 × 1 = 0.7812 USD.
On March 3, 11:00:00 - 12:00:00, 2 instances were running, and the cost for this hour: 0.000217 × 3600 × 2 = 1.5624 USD.
Total cost for two hours: 0.7812 + 1.5624 = 2.3436 USD.
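The same calculation can be reproduced with a short Python sketch, which may help when estimating costs for other tiers or instance counts. The tier keys (`"entry"`, `"basic"`, `"enhanced"`) and the function `total_cost` are hypothetical names chosen for illustration; the prices are the list prices from the pricing table. `Decimal` is used to avoid floating-point drift on small per-second prices.

```python
from decimal import Decimal

# List prices in USD per second, taken from the pricing table.
# The dictionary keys are illustrative names, not platform identifiers.
PRICE_PER_SECOND = {
    "entry": Decimal("0.000217"),
    "basic": Decimal("0.000220"),
    "enhanced": Decimal("0.000250"),
}


def total_cost(tier: str, instance_seconds: list[int]) -> Decimal:
    """Sum the cost of independently metered instances of one GPU tier."""
    price = PRICE_PER_SECOND[tier]
    return sum((price * seconds for seconds in instance_seconds), Decimal("0"))


# Instance 1 runs 10:00:00-12:00:00 (7200 s); instance 2 runs 11:00:00-12:00:00 (3600 s).
cost = total_cost("entry", [7200, 3600])
assert cost == Decimal("2.3436")  # matches the total in the example
```

Metering each instance over its own full running time (7200 s and 3600 s) gives the same total as the hour-by-hour breakdown, since both add up to 10800 instance-seconds.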