Edge Inference Fee (Postpaid)
Edge Inference provides GPU inference on EdgeOne edge nodes: users can deploy custom model images or platform-preset models to edge nodes and run inference there. Edge Inference is available only in the Enterprise Edition plan and is billed on a postpaid basis by instance running duration. The smallest billing unit is a single inference instance, and postpaid bills are generated from each instance's running duration.
Instance running duration: the total time, in seconds, from the startup of an inference instance to its termination. Billing is calculated per second with a minimum charge of 1 second, and any fraction of a second is rounded up to a full second.
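The rounding rule above can be illustrated with a minimal Python sketch. The function name `billable_seconds` is hypothetical and not part of any EdgeOne API; it only assumes that a raw duration is measured in seconds and may include a fraction.

```python
import math


def billable_seconds(duration_seconds: float) -> int:
    """Round a running duration up to whole seconds, with a 1-second minimum.

    Hypothetical helper for illustration only; the actual metering is done
    by the platform.
    """
    if duration_seconds < 0:
        raise ValueError("duration cannot be negative")
    # Any fraction of a second counts as a full second; at least 1 second is charged.
    return max(1, math.ceil(duration_seconds))


print(billable_seconds(0.2))    # 1
print(billable_seconds(59.01))  # 60
print(billable_seconds(3600))   # 3600
```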
Note:
The Edge Inference feature is currently in beta and is available only to accounts that have been added to the allowlist. It is supported only in the Enterprise Edition plan. To request access, please Contact Us.
Edge Inference Cost
Edge Inference is billed on a postpaid basis according to the running duration of instances, priced by GPU specification, and settled monthly. The postpaid pricing is as follows:
Billable Item | GPU Specification | List Price (USD/second) | Billing Mode | Settlement Cycle |
Custom Inference Service | Entry-level (A-tier) | 0.000217 | Postpaid | Monthly |
Custom Inference Service | Basic (B-tier) | 0.000220 | Postpaid | Monthly |
Custom Inference Service | Basic Enhanced (C-tier) | 0.000250 | Postpaid | Monthly |
Note:
1. Edge Inference is billed based on the actual running duration of instances. Billing stops as soon as the instance is stopped.
2. If auto scaling (AS) is enabled, each instance is metered independently, and the total cost is the sum of the running duration costs of all instances.
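As a rough sketch of how independently metered instances add up, the following Python snippet sums the cost of several instances of the same GPU tier. The prices are copied from the pricing table; the dictionary keys, the `total_cost` function, and the sample durations are hypothetical and chosen only for illustration.

```python
from decimal import Decimal

# List prices in USD per second, copied from the pricing table.
PRICE_PER_SECOND = {
    "A": Decimal("0.000217"),  # Entry-level
    "B": Decimal("0.000220"),  # Basic
    "C": Decimal("0.000250"),  # Basic Enhanced
}


def total_cost(tier: str, instance_seconds: list[int]) -> Decimal:
    """Sum the cost of instances that are each metered on their own duration."""
    price = PRICE_PER_SECOND[tier]
    return sum((price * seconds for seconds in instance_seconds), Decimal("0"))


# Hypothetical scale-out: three B-tier instances with different running durations.
print(total_cost("B", [3600, 1800, 600]))  # 1.320000
```

Decimal arithmetic is used here instead of floats so that small per-second prices do not accumulate rounding noise.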
Billing example: A user creates a custom inference service under the Enterprise Edition plan, selects the Entry-level GPU (A-tier), and manually sets the instance count to 1.
Assume that the user starts the service at 10:00:00 on March 3. From 10:00:00 to 11:00:00 on March 3, the single instance runs continuously for 1 hour (3600 seconds). At 11:00:00 on March 3, the user increases the number of instances from 1 to 2. Both instances then run continuously from 11:00:00 to 12:00:00, for 1 hour (3600 seconds) each.
The unit price of the Entry-level GPU (A-tier) is 0.000217 USD/second, so the cost for these two hours is settled as follows:
On March 3, 10:00:00 - 11:00:00, 1 instance was running, and the cost for this hour: 0.000217 × 3600 × 1 = 0.7812 USD.
On March 3, 11:00:00 - 12:00:00, 2 instances were running, and the cost for this hour: 0.000217 × 3600 × 2 = 1.5624 USD.
Total cost for two hours: 0.7812 + 1.5624 = 2.3436 USD.
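The arithmetic in this example can be reproduced with a short Python sketch. The `segments` list and `segment_cost` function are hypothetical names used only for illustration; the sketch assumes the second instance starts exactly at 11:00:00 and that every period is a whole number of seconds.

```python
from decimal import Decimal

PRICE_A_TIER = Decimal("0.000217")  # USD per second, Entry-level (A-tier)

# (running instances, duration in seconds) for each period in the example
segments = [
    (1, 3600),  # March 3, 10:00:00 - 11:00:00, 1 instance
    (2, 3600),  # March 3, 11:00:00 - 12:00:00, 2 instances
]


def segment_cost(instances: int, seconds: int, price: Decimal = PRICE_A_TIER) -> Decimal:
    """Cost of one period: unit price x seconds x number of running instances."""
    return price * seconds * instances


costs = [segment_cost(n, s) for n, s in segments]
total = sum(costs, Decimal("0"))

for (n, s), cost in zip(segments, costs):
    print(f"{n} instance(s) x {s} s = {cost} USD")
print(f"Total: {total} USD")
```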